Stats2 takes list of requests from RequestManager2 and stores smaller (not all attributes) copy of all requests. It also takes nuber of open/done events from dbs and collects history for each dataset in all requests. Main improvement over old Stats is that Stats2 fetch only changes of workflows since last update and fetches events only for these workflows which are currently in production. As a result, updates take about 10-30 minutes instead of 1-2 hours.
Before running the web application or the update script stats_update.py
, please set the following elements.
- Path to Grid Certificates to consume resources available in CMS WEB
USERCRT
andUSERKEY
- Basic authentication header to authenticate write requests to the DB
STATS_DB_AUTH_HEADER
. Basic authentication header consists of words Basic and base64 encoded "username:password" value, for example:"Basic dXNlcjpwYXNzd29yZA=="
.
The following settings are related to trigger updates outside the Stats2 application:
- Credentials to request an access token via client credential grant to trigger updates in external PdmV Services:
McM
,RelVal
,ReReco
. Set these credentials under the environment variables:CALLBACK_CLIENT_ID
&CALLBACK_CLIENT_SECRET
. Remember to link the role related to PdmV Service operationscms-pdmv-serv
to be allowed to perform this kind of operation. - Set the target audiences, this determines the applications where the requested token is going to be valid. Set the target audience for the production
environment application via
APPLICATION_CLIENT_ID
andDEV_APPLICATION_CLIENT_ID
for development.
Optional:
- If the database is not reachable in
localhost
, you can overwrite its URL via the environment variableDB_URL
export USERCRT=/.../user.crt.pem
export USERKEY=/.../user.key.pem
export STATS_DB_AUTH_HEADER="Basic base64encodedvalue"
export APPLICATION_CLIENT_ID='...'
export DEV_APPLICATION_CLIENT_ID='...'
export CALLBACK_CLIENT_ID='...'
export CALLBACK_CLIENT_SECRET='...'
# Optional
export DB_URL='....'
python3 stats_updater.py --action update
Usually Stats2 should be used from web browser, but it can be used as a python script as well: Update all requests:
python3 stats_update.py --action update
Update one request (NAME is request name):
python3 stats_update.py --action update --name NAME
Preview one request (NAME is request name):
python3 stats_update.py --action see --name NAME
Note that update actions require two environment variables: USERKEY
and USERCRT
which should point to user GRID certificate and key files.
Install Python 3.11 (or try to install a higher version). To achieve this, you can install it directly or by using pyenv Then, install all the required depedencies (in a virtualenv if you like):
python3.11 -m pip install -r requirements.txt
Add:
[bintray--apache-couchdb-rpm]
name=bintray--apache-couchdb-rpm
baseurl=http://apache.bintray.com/couchdb-rpm/el$releasever/$basearch/
gpgcheck=0
repo_gpgcheck=0
enabled=1
To:
/etc/yum.repos.d/bintray-apache-couchdb-rpm.repo
or on puppet managed machines:
/etc/yum-puppet.repos.d/bintray-apache-couchdb-rpm.repo
sudo yum -y install epel-release
sudo yum -y install couchdb
CouchDB is installed to /opt/couchdb
directory
Following views must be created in requests
database, in _designDoc
design document.
Campaigns (view name:campaigns
):
function (doc) {
if (doc.Campaigns) {
addedCampaigns = {};
var i;
for (i = 0; i < doc.Campaigns.length; i++) {
if (!(doc.Campaigns[i] in addedCampaigns)) {
emit(doc.Campaigns[i], doc.RequestName);
addedCampaigns[doc.Campaigns[i]] = true;
}
}
}
}
Input datasets (view name:inputDatasets
):
function (doc) {
var i;
if (doc.InputDataset) {
emit(doc.InputDataset, doc.RequestName);
}
}
Output datasets (view name:outputDatasets
):
function (doc) {
var i;
if (doc.OutputDatasets) {
for (i = 0; i < doc.OutputDatasets.length; i++) {
emit(doc.OutputDatasets[i], doc.RequestName);
}
}
}
PrepIDs (view name:prepids
):
function (doc) {
if (doc.PrepID) {
emit(doc.PrepID, doc.RequestName);
}
}
Request types (view name:types
):
function (doc) {
if (doc.RequestType) {
emit(doc.RequestType, doc.RequestName);
}
}
Processing strings (view name:processingStrings
):
function (doc) {
if (doc.ProcessingString && doc.ProcessingString.length > 0) {
emit(doc.ProcessingString, doc.RequestName);
}
}
Requests (tasks) (view name:requests
):
function (doc) {
if (doc.Requests) {
addedRequests = {};
var i;
for (i = 0; i < doc.Requests.length; i++) {
if (!(doc.Requests[i] in addedRequests)) {
emit(doc.Requests[i], doc.RequestName);
addedRequests[doc.Requests[i]] = true;
}
}
}
}
Stats2 CouchDB should be available to everyone to read, but no one, except admin should be allowed to update it.
In CouchDB settings: require_valid_user
must be set to false
.
In requests
and settings
databases a new design document must be created:
{
"_id": "_design/validate_write",
"validate_doc_update": "function (newDoc, oldDoc, userCtx) { if (userCtx.roles.indexOf('_admin') === -1) throw( { forbidden : 'Only admins can modify the database.'} ); }"
}
This design document checks if user, who is trying to make changes, has _admin
role.
git clone https://github.com/cms-PdmV/Stats2.git
Launch the main.py
python3 main.py [--host] [--port] [--debug]
There are three available arguments when launching a website: --host
, --port
and --debug
. Website by default is launched with host ip 127.0.0.1 on port 80. Host can be overwritten with --host
. Port can be overwritten with --port
. Parameter --debug
launches flask server in debug mode.
python3.11 main.py &
python3.11 main.py --host 0.0.0.0 --port 8000 &
python3.11 main.py --debug
Create a file /etc/systemd/system/stats.service
to run Stats2 as a service. File contents:
[Unit]
Description = PdmVs Stats service website
After = network.target
[Service]
Type = simple
WorkingDirectory=/home/pdmvserv/Stats2
ExecStart = /bin/python3.11 main.py --host 0.0.0.0 --port 80
Restart=on-failure
RestartSec=20
ExecStop=/bin/kill -TERM \$MAINPID
[Install]
WantedBy = multi-user.target
Periodic database compaction must be performed to prevent host of running out of space:
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/settings/_compact -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/campaigns -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/outputDatasets -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/prepids -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/processingStrings -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/requests -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_compact/_designDoc/types -H "Authorization: Basic $STATS_DB_AUTH_HEADER"
curl -s -k -H "Content-Type: application/json" -X POST http://localhost:5984/requests/_view_cleanup -H "Authorization: Basic $STATS_DB_AUTH_HEADER"