automation contains a full user archiver automation docker compose service. #138

Open
wants to merge 37 commits into
base: master
aa401bb
feat: Add index.py
bitdeep Sep 8, 2024
536e819
fix: Force colorama to use ANSI escape sequences in SSH Docker console
bitdeep Sep 8, 2024
25cae52
feat: Add support for colorized console output and human-readable fil…
bitdeep Sep 8, 2024
75b562c
refactor: move session check to separate function
bitdeep Sep 8, 2024
812769b
feat: create index.html with links to each group's generated HTML
bitdeep Sep 8, 2024
60731e9
feat: Add Bootstrap and improve index.html formatting
bitdeep Sep 8, 2024
e67e289
feat: add message sending to self for tg-archive errors and alerts
bitdeep Sep 8, 2024
003d895
feat: detect and set MY_USERNAME from TelegramClient instance
bitdeep Sep 8, 2024
668180e
feat: Improve group processing output and disable unnecessary logging
bitdeep Sep 8, 2024
1837257
fix: Handle Jinja2 UndefinedError in tg-archive build
bitdeep Sep 8, 2024
87bc949
feat: Add automation index.py
bitdeep Sep 8, 2024
1ba566e
feat: Add time tracking to log_id
bitdeep Sep 8, 2024
13a7181
feat: Add get_log_id function and update run_tg_archive
bitdeep Sep 8, 2024
4b225e9
fix: Add datetime import to resolve undefined name error
bitdeep Sep 8, 2024
9b86656
refactor: Improve the formatting and readability of the get_log_id fu…
bitdeep Sep 8, 2024
38ba0df
feat: Add UTF-8 icons to log output
bitdeep Sep 8, 2024
d68f42a
feat: Move print statements with colorama to new color functions
bitdeep Sep 8, 2024
852d850
feat: Pass start time to print functions in run_tg_archive
bitdeep Sep 8, 2024
2503eda
fix: Remove unused variable `group_name` from `run_tg_archive` function
bitdeep Sep 8, 2024
a22833c
refactor: Refactor get_log_id to pass the group and use values from it
bitdeep Sep 8, 2024
9d12ac3
feat: move out the while process.poll to another function to show stats
bitdeep Sep 8, 2024
323891d
feat: add log file writing to show_process_stats
bitdeep Sep 8, 2024
485431a
chore: remove sync_log and build_log code, log to console
bitdeep Sep 8, 2024
1fd2353
feat: add free space information to show_process_stats
bitdeep Sep 8, 2024
ddfbf5d
fix: Update imports and formatting in automation/index.py
bitdeep Sep 8, 2024
bcaaabf
feat: Add disk usage color indicator in process status
bitdeep Sep 8, 2024
5d0883f
refactor: Refactor show_process_stats to display disk space and statu…
bitdeep Sep 8, 2024
7b478d0
feat: Add progress indicator for group processing
bitdeep Sep 8, 2024
e27db46
feat: Add time tracking for group processing
bitdeep Sep 8, 2024
3715b23
feat: Add index.html generation after processing groups
bitdeep Sep 8, 2024
5a25f1f
feat: Add check for generated index.html file before adding group to …
bitdeep Sep 8, 2024
7a4c521
feat: Add last update time and directory size to index.html
bitdeep Sep 8, 2024
1207310
feat: Move HTML strings to template files
bitdeep Sep 8, 2024
0b03331
feat: Add support for skipping specific message IDs
bitdeep Sep 8, 2024
314896b
feat: Add functionality to skip specific message IDs based on configu…
bitdeep Sep 8, 2024
3137c69
feat: Add average processing time and estimated completion time to "T…
bitdeep Sep 8, 2024
bb11354
added support for full user group archive automation
bitdeep Sep 9, 2024
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
.aider*
data
session
.vscode
.env
3 changes: 3 additions & 0 deletions automation/.dockerignore
@@ -0,0 +1,3 @@
data
session

6 changes: 6 additions & 0 deletions automation/.env.example
@@ -0,0 +1,6 @@
API_ID=11111111
API_HASH=43432424242424242424242424242424
UID=1000
GID=1000
TZ=America/Sao_Paulo
SESSION_ID=tg-archiver-session
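
A minimal sketch of how a script could read and validate these variables at startup. The function name `load_settings`, the fallback values, and the 32-hex-character check on `API_HASH` are assumptions for illustration, not code from this PR.

```python
import os
import re


def load_settings(env=os.environ):
    """Read and validate the variables listed in .env.example (hypothetical helper)."""
    api_id = env.get("API_ID", "")
    api_hash = env.get("API_HASH", "")
    if not re.fullmatch(r"[0-9]+", api_id):
        raise ValueError("API_ID must be a numeric string")
    # Assumption: the API hash is a 32-character hexadecimal string.
    if not re.fullmatch(r"[0-9a-fA-F]{32}", api_hash):
        raise ValueError("API_HASH must be 32 hexadecimal characters")
    return {
        "api_id": int(api_id),
        "api_hash": api_hash,
        # Assumption: fall back to UTC and the example session name when unset.
        "tz": env.get("TZ", "UTC"),
        "session_id": env.get("SESSION_ID", "tg-archiver-session"),
    }
```

Failing early on a malformed `API_ID` or `API_HASH` gives a clearer error than a failed Telegram login later in the run.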
24 changes: 24 additions & 0 deletions automation/Dockerfile
@@ -0,0 +1,24 @@
FROM redhat/ubi9-minimal:9.4-1227
RUN microdnf install -y --nodocs python3 python3-pip procps-ng findutils && \
    microdnf clean all && \
    pip-3 install 'PyYAML>=5.4.1' && \
    pip-3 install 'pytz==2023.3.post1' && \
    pip-3 install 'tg-archive==1.1.3' && \
    pip-3 install 'colorama' && \
    pip-3 install 'humanize' && \
    pip-3 install 'telethon'
CMD ["sh", "-c", "sleep infinity"]
WORKDIR /app
ENV HOME=/app
ARG UID
ARG GID
ENV TZ=${TZ} UID=${UID} GID=${GID}
COPY . /app
RUN chmod +x /app/index.sh
RUN groupadd -g ${GID:-1000} app && \
    useradd -u ${UID:-1000} -g app app && \
    mkdir -p /app /data && \
    chown -R app:app /app /data
VOLUME ["/session"]
USER app
CMD ["/app/index.sh"]
21 changes: 21 additions & 0 deletions automation/LICENSE
@@ -0,0 +1,21 @@
The MIT License

Copyright (c) 2021, Kailash Nadh. https://nadh.in

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
40 changes: 40 additions & 0 deletions automation/README.md
@@ -0,0 +1,40 @@
## How it works

tg-archive uses the [Telethon](https://github.com/LonamiWebs/Telethon) Telegram API client to periodically sync messages from ALL your non-archived groups to a local SQLite database (file), downloading only new messages since the last sync. It then generates a static archive website of messages to be published anywhere.
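
The "non-archived groups" selection can be sketched as a simple filter. This is an illustration only: `DialogInfo` is a hypothetical stand-in, and the assumption is that the dialog objects returned by Telethon expose similarly named `name`, `is_group`, and `archived` attributes.

```python
from dataclasses import dataclass


@dataclass
class DialogInfo:
    """Hypothetical stand-in for a Telegram dialog, reduced to three fields."""
    name: str
    is_group: bool
    archived: bool


def groups_to_archive(dialogs):
    """Return the names of group chats that are not in the Archived folder."""
    return [d.name for d in dialogs if d.is_group and not d.archived]


dialogs = [
    DialogInfo("Family", is_group=True, archived=False),
    DialogInfo("Old project", is_group=True, archived=True),
    DialogInfo("Alice", is_group=False, archived=False),
]
print(groups_to_archive(dialogs))  # ['Family']
```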

## Features

- 📁 Extra feature: scan all your non-archived groups and archive them.
- 🔄 Periodically sync Telegram group messages to a local DB.
- 🖼️ Download user avatars locally.
- 📥 Download and embed media (files, documents, photos).
- 📊 Renders poll results.
- 😀 Use emoji alternatives in place of stickers.
- 📝 Single file Jinja HTML template for generating the static site.
- 📅 Year / Month / Day indexes with deep linking across pages.
- 🔗 "In reply to" on replies with links to parent messages across pages.
- 📰 RSS / Atom feed of recent messages.

## Install

- Get [Telegram API credentials](https://my.telegram.org/auth?to=apps). Use the normal user account API, not the Bot API.
  - If this page produces an alert stating only "ERROR", disconnect from any proxy/VPN and try again in a different browser.
- Copy `.env.example` to `.env`.
- Create the directory `session` in the same directory as the compose file.
- Copy any previously generated session file to `./session/session.session`. If you don't have a session, run the container, enter it, and run tg-archive to generate a new one.
  - Inside the container: `python /usr/local/bin/tg-archive --new --path=session`
  - Ensure that the session file is generated at `/session/session.session`. This file contains the API authorization for your account.
  - After about 30 seconds, the script should start automatically.
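
The wait for the session file can be sketched as a polling loop. This is a minimal sketch under assumptions: the helper name `wait_for_session` and the 30-second default are illustrative, and the PR's own startup script may work differently.

```python
import time
from pathlib import Path


def wait_for_session(path, timeout=30.0, interval=1.0):
    """Poll until `path` exists and is non-empty, or until `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    session = Path(path)
    while True:
        if session.is_file() and session.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller might use `wait_for_session("/session/session.session")` and only start archiving when it returns `True`.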

### Customization

Edit the generated `./data/*/template.html` and the static assets in each `./data/*/static` directory to customize each group's site.

### Note

- The sync can be stopped (Ctrl+C) any time to be resumed later.
- Set up a cron job to periodically sync messages and re-publish the archive.
- Downloading large media files and long message history from large groups continuously may run into Telegram API's rate limits. Watch the debug output.
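
One sync-and-build pass, as a scheduler such as cron might trigger it, could look like the sketch below. It assumes the `tg-archive` executable is on `PATH` and is run from inside a site directory with its `--sync` and `--build` flags. The helper names and the injectable `runner` parameter are hypothetical.

```python
import subprocess
from pathlib import Path


def archive_commands(site_dir):
    """Return (argv, cwd) pairs for one sync-then-build pass over a site directory."""
    cwd = str(Path(site_dir))
    return [
        (["tg-archive", "--sync"], cwd),
        (["tg-archive", "--build"], cwd),
    ]


def run_pass(site_dir, runner=subprocess.run):
    """Run sync, then build; stop at the first command that exits non-zero."""
    for argv, cwd in archive_commands(site_dir):
        result = runner(argv, cwd=cwd)
        if result.returncode != 0:
            return False
    return True
```

Skipping the build when the sync fails avoids re-publishing a site from a partially synced database.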

Licensed under the MIT license.
15 changes: 15 additions & 0 deletions automation/docker-compose.yml
@@ -0,0 +1,15 @@
services:
  tg-archive:
    container_name: tg-archive
    build: .

    volumes:
      - ./data:/data
      - ./session:/session
    environment:
      - TZ=${TZ}
      - API_ID=${API_ID}
      - API_HASH=${API_HASH}
      - UID=${UID}
      - GID=${GID}
      - TERM=xterm-256color