Sidekiq worker process that controls the kafka consumers.
Currently there are four consumers and one worker, all located here:
- attribution: listens to install events and matches these to clicks, to attribute the user to a network. Generates conversion events when a match is identified.
- clickstore: responsible for storing click events in redis and collecting statistics on campaign links.
- conversion: responsible for handling conversion events and generating postback calls for these events.
- postbacks: generates postback calls for all events where required. This does not handle conversion events.
- url_worker: worker for triggering postback urls generated by the postbacks and conversion consumers.
Consumers are sidekiq workers and are run by sidekiq. However, they are not scheduled with sidekiq directly; instead, each consumer has a scheduler that queues it for execution.
The intention is to have short-running processes that consume events and then stop, only to be restarted by the scheduler. This avoids memory leaks and zombie consumer processes.
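A minimal sketch of what such a short-running consume loop could look like, in plain Ruby. The limits, the `consume_batch` name and the `fetch` callable are all hypothetical stand-ins (the real consumers use sidekiq and a kafka client, and their actual stop conditions are not described here):

```ruby
# Hypothetical limits; the real values are not given in this README.
MAX_RUNTIME_SECONDS = 30
MAX_MESSAGES = 1_000

# Consumes messages until the time or message budget is used up (or no
# message is available), then returns so that a scheduler can queue a
# fresh run. `fetch` stands in for the kafka client call and is assumed
# to return nil when there is nothing to read.
def consume_batch(fetch, max_runtime: MAX_RUNTIME_SECONDS, max_messages: MAX_MESSAGES)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_runtime
  handled = 0

  while handled < max_messages &&
        Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    message = fetch.call
    break if message.nil?

    yield message
    handled += 1
  end

  handled
end

# Usage with an in-memory queue instead of kafka:
queue = %w[a b c]
seen = []
count = consume_batch(-> { queue.shift }, max_messages: 2) { |m| seen << m }
```

Because each run is bounded, any memory held by a consumer is released when the run ends, and a stuck consumer cannot outlive its budget by much.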
An example of a typical kafka message:
/t/ist bot_name&country=DE&device=smartphone&device_name=iPhone&ip=3160898477&klag=1&platform=ios&ts=1465287056 adid=ECC27E57-1605-2714-CCCC-13DC6DFB742E
- First comes the event type. The actual type is assumed to be everything after the final '/' (slash).
- Then the meta data, in the form of CGI-encoded parameter/value pairs. The meta data is generated exclusively by the kafkastore and its values are based on the IP and user agent information. In addition, there is a klag value that represents the time (in seconds) the message waited in redis before being pushed to kafka.
- Finally the query string of the original request. This is just passed through from the tracker, unmodified.
If this format should change, then the consumers need updating. However this is only the case if the format changes (i.e. <type> <meta> <params>), not if there are extra "meta" or "query" parameters included.
Also if the format is changed here, then the kafkastore needs updating.
The format is explained in more detail elsewhere.
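As an illustration of the <type> <meta> <params> layout, here is a minimal parsing sketch. The `ParsedMessage` struct and `parse_message` method are hypothetical names, not the consumers' actual code; it assumes the three parts are separated by single spaces and that meta and params can be decoded as form-encoded pairs:

```ruby
require "uri"

# Hypothetical value object; the real consumers may represent messages differently.
ParsedMessage = Struct.new(:type, :meta, :params)

# Splits a message into its three space-separated parts and decodes each one.
def parse_message(line)
  path, meta, query = line.chomp.split(" ", 3)
  raise ArgumentError, "expected '<type> <meta> <params>'" if path.nil? || meta.nil?

  ParsedMessage.new(
    path.rpartition("/").last,           # everything after the final '/'
    URI.decode_www_form(meta).to_h,      # valueless keys such as bot_name map to ""
    URI.decode_www_form(query.to_s).to_h # a missing query string yields {}
  )
end

message = parse_message(
  "/t/ist bot_name&country=DE&device=smartphone&device_name=iPhone" \
  "&ip=3160898477&klag=1&platform=ios&ts=1465287056 " \
  "adid=ECC27E57-1605-2714-CCCC-13DC6DFB742E"
)
message.type            # "ist"
message.meta["klag"]    # "1"
message.params["adid"]  # "ECC27E57-1605-2714-CCCC-13DC6DFB742E"
```

Since the decoded pairs end up in plain hashes, extra "meta" or "query" parameters pass through this kind of parser without any changes; only a change to the three-part layout would break it.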
Generate a .env and then fill it with values:
prompt> rake appjson:to_dotenv
prompt> $EDITOR .env
Start the worker and web frontend with:
prompt> foreman start web
prompt> foreman start worker
The easiest way to deploy this is to use heroku!