Hosting questions #120
This depends on what you want to do with CM. Here's my ballpark:
System Specs
1 bot, 1 subreddit, no image processing
For each additional subreddit (regardless of the number of bots) "add" 5MB of free memory, UP TO 500MB total (base + additional).
CM can work with more or less memory. The docker image targets 512MB but this can be modified. Generally, less memory with "more" subreddits than the above recommendation results in a slower bot, as node has to free up memory more often, but it will still work!
With image processing
Image processing requires holding uncompressed image data in memory while it is manipulated. Add an additional 100-200MB+ of free memory on top of the base memory, depending on how much image processing is used.
Database Specs
The database choice should be determined largely by event volume. Using sqlite is fine if total volume is very low, so you could be running 20 subreddits as long as the total aggregate volume for all subs is < 50/hour.
Note: If volume is higher than this, a dedicated database like mysql or postgres may be better. Though better-sqlite3 should be fine for all but the highest volumes, I think performance is better, in general, when using a dedicated database.
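The sizing rules of thumb above can be turned into a small calculator. This is only a sketch of the arithmetic in this comment, not anything CM ships: the function names are made up for illustration, and `base_mb` is a required argument because the baseline figure for "1 bot, 1 subreddit" is not preserved in this thread.

```python
def estimate_memory_mb(base_mb, subreddits, image_processing_mb=0):
    """Rough free-memory ballpark following the rule of thumb above.

    base_mb: memory for 1 bot / 1 subreddit (you must supply this yourself).
    subreddits: total number of subreddits the instance runs.
    image_processing_mb: extra headroom (the comment suggests 100-200MB+).
    """
    if subreddits < 1:
        raise ValueError("need at least one subreddit")
    # 5MB for each subreddit beyond the first
    additional = 5 * (subreddits - 1)
    # base + additional is capped at 500MB total
    core = min(base_mb + additional, 500)
    # image processing headroom is added on top of that
    return core + image_processing_mb


def suggest_database(events_per_hour):
    """sqlite below roughly 50 events/hour in aggregate, otherwise a dedicated DB."""
    return "sqlite" if events_per_hour < 50 else "mysql or postgres"


# 256 is a placeholder base value, not a figure taken from this thread
print(estimate_memory_mb(256, subreddits=21, image_processing_mb=150))
print(suggest_database(events_per_hour=20))
```

Treat the result as a starting point only: as noted above, CM still works with less memory, just more slowly.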
The operator is whoever is running the actual CM instance. Side note: this can be different for the client and server CM instances, but by default it is the same. Specifying an operator reddit account is necessary so that CM knows which reddit account is authorized to create new bots in the instance. It also gives the operator a more permissive dashboard -- operators can see all subreddits, logs, and configs in the dashboard. (They cannot change configs without guest access, though.)
An instance, in this context, is the server component of CM that will run the actual bot. A CM client (web interface) can connect to multiple, independent CM server instances. In the default configuration there is only one client and one instance. I should probably hide that field in the setup screen when there is only one instance available.
Cache
This is basically correct! I have plans to make "subset" requests fetchable from cache eventually.
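To illustrate why matching window criteria matters, here is a toy model of a cache keyed on the exact request. It is a sketch of the behaviour described in the question, not CM's actual implementation: `WindowCache` and `fake_fetch` are hypothetical names, and the fetch function merely stands in for a reddit API call.

```python
class WindowCache:
    """Toy cache keyed on the exact (user, window size) pair."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical stand-in for a reddit API call
        self._store = {}
        self.api_calls = 0

    def get_history(self, user, window):
        key = (user, window)
        if key not in self._store:
            # no entry for this exact criteria, so we pay for an API call
            self.api_calls += 1
            self._store[key] = self._fetch(user, window)
        return self._store[key]


def fake_fetch(user, window):
    # pretend to return the user's last `window` activities
    return [f"{user}-activity-{i}" for i in range(window)]


cache = WindowCache(fake_fetch)
cache.get_history("someUser", 200)  # miss: first request
cache.get_history("someUser", 200)  # hit: identical criteria
cache.get_history("someUser", 100)  # miss: a subset, but a different key
print(cache.api_calls)  # prints 2
```

Under this model, asking for 100 activities after 200 costs a second call even though the data is already held, which is the gap the planned "subset" support would close.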
Yes, redis is not included in the CM docker image. It can be wired together with docker-compose or some other external service.
Yes, mod notes are cached for one minute by default.
Data expires to avoid stale cache. Here are all the things CM tries to cache from reddit and their TTLs. Cache expires because user history, submission/comment state (stickied, reports, etc.), and other data from reddit changes.
Setting cache TTLs too low would prevent CM from reusing data and previous processing results, which would force CM to make additional API calls to reddit. For example, if you set ...
Setting cache TTLs too high increases the probability of stale data in the cache, which could cause CM to process an activity incorrectly. For example, if you set ...
The caching defaults are not storage related; they exist to prevent stale data from being used. The defaults are intentional -- they give CM a reasonable amount of time to reuse cache with a low probability of stale data. They can, however, be changed per subreddit :)
For instance, if you have rules that only ever look at a user's initial history, but may have to do it repeatedly, you can safely increase caching:
caching:
  authorTTL: 600 # cache user history for 10 minutes
  modNotesTTL: 300 # cache mod notes for a user for 5 minutes
runs:
  # ...
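The TTL trade-off described in this comment can be sketched with a minimal time-based cache. This is an illustration only, not CM's cache code: `TTLCache` is a hypothetical class, and the injectable `clock` exists just so the example can simulate time passing.

```python
import time


class TTLCache:
    """Minimal cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, fetch):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: reuse, no API call
        value = fetch()  # missing or expired: fetch again
        self._store[key] = (now + self.ttl, value)
        return value


# Simulated clock so no real waiting is needed
fake_now = [0.0]
api_calls = []
notes = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
fetch_notes = lambda: api_calls.append(1) or f"notes v{len(api_calls)}"

notes.get("someUser", fetch_notes)  # t=0s: fetched
fake_now[0] = 30
notes.get("someUser", fetch_notes)  # t=30s: reused from cache
fake_now[0] = 61
notes.get("someUser", fetch_notes)  # t=61s: expired, fetched again
print(len(api_calls))  # prints 2
```

A shorter TTL moves more of these lookups into the "fetched again" branch (more API calls), while a longer one keeps returning the old value even if the underlying reddit data has since changed.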
Database
The docker image uses sqlite by default and will do automatic backups and migrations for you. If you switch to mysql/postgres you can force migrations to run automatically like this (in the operator config):
databaseConfig:
  migrations:
    force: true # always run migrations
However, CM also has a migration UI! If you start CM and it detects that it requires a migration that cannot be done automatically, you can visit the "dashboard" to get a migration confirmation page that lets you execute the migration.
Other
I think CM should be logging to file by default. Check your cm data folder for a logs folder.
logging:
  # default level for all logging
  level: debug
  file:
    # override default level
    level: warn
    # true -> log folder at DATA_DIR/logs
    # /home/myUser/logs -> absolute location of folder (remember this is in the container if using docker!)
    # ./myLogs -> relative location from DATA_DIR
    dirname: true
Code used by a docker image
The CM code in a docker image is pinned to docker image tags that mirror release versions, the master branch (...
If you use a release tag, EX ...
If you use a branch tag (...
Upgrade process
CM is designed to not need the database or cache to operate. The database is basically for keeping track of statistics and Actioned Events for viewing prior bot history. If you removed the database and cache (or provided brand new instances of both), CM will happily use them. So migrating a database is only necessary to make sure CM history is preserved.
The config, which is persisted in the DATA_DIR host folder, is not affected by upgrades or a new CM instance image.
When the CM container is started (regardless of upgrade or not):
Hello, per our discussion, here is the GitHub issue for the things I was unsure about after going through the hosting/operator docs:
• minimum/recommended specs for hosting (CPU, RAM, Storage, bandwidth) (I set up on an Ubuntu 22.04 LTS VM with Docker)
• I'm not sure what the operator reddit account is, or if it should be different than the bot account.
• On the setup screen, I'm not sure what Instances are, but there was just 1 option so I proceeded
• Caching
◇ “It is therefore important to re-use window criteria wherever possible to take advantage of this caching.”
▪ My takeaway from this and caching docs in general is that your first run should request the largest amount of history that will be needed anywhere in the config and then subsequent runs should match it. If you request 200 activities on the first run, it doesn't save you any API hits later to only look at 100. You don't want your runs to request their last 10 activities, then the last 30 on the next run, then the last 100.
◇ Redis must not be included in the docker image; the web interface doesn't start if the config file tries to use it as cache
◇ Are mod notes cached? userNotes are in the cache docs and I know ModNotes are said to be api intensive
◇ TTL
▪ Why does data expire?
▪ What are the implications or problems with setting these values too low or high?
▪ I'm guessing a week is way too long if the default is 1 minute, but I'm not sure why other than maybe storage space or the data is expected to change (but maybe that doesn't matter for the use case)
• Database
◇ typo “retention: '3 months' # each subreddit will retain 3 more of recorded events”
◇ Migrations, is this something I have to worry about using the docker default db? It says it will pause startup and that could mean some troubleshooting on my headless server
• Other
◇ Is there a log file to troubleshoot CM, in case the web server doesn't successfully start?
◇ How do CM updates happen with docker?