Releases: bomquote/transistor
pypi 0.2.4 release
v0.2.2
Fixed a bug in the BaseWorker.load_items() method which previously resulted
in losing scrape data when the number of workers did not equal the number
of tasks. Now, using any number of workers or pool size will result in
consistent export/save results, while scrape time will change in proportion
to the number of workers assigned. Wrote tests to ensure the same.
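The invariant behind this fix, that saved results should not depend on the worker count, can be illustrated with a generic sketch. This is not Transistor's implementation, which these notes do not show. It uses only the standard library's queue and threading modules, and run_pool and fake_scrape are hypothetical names chosen for illustration.

```python
import queue
import threading


def fake_scrape(task):
    """Hypothetical stand-in for a scrape job; returns one result per task."""
    return f"result-for-{task}"


def run_pool(tasks, num_workers):
    """Process every task with num_workers threads and return all results."""
    todo = queue.Queue()
    for task in tasks:
        todo.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        # Each worker keeps pulling until the queue is empty, so the number
        # of workers does not need to match the number of tasks.
        while True:
            try:
                task = todo.get_nowait()
            except queue.Empty:
                return
            result = fake_scrape(task)
            with lock:
                results.append(result)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Sort so the output order does not depend on thread scheduling.
    return sorted(results)


tasks = ["a", "b", "c", "d", "e"]
# Fewer workers, equal workers, and more workers than tasks all agree.
assert run_pool(tasks, 2) == run_pool(tasks, 5) == run_pool(tasks, 8)
```

A check of this shape, comparing results across several pool sizes, is one way such a regression could be tested.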
v0.2.1
Added a url parameter to the WorkGroup, which is a more attractive API
than including the URL in a kwarg. The URL was originally included as a
kwarg because, depending on how the custom Spider is set up, the URL may
already be specified, and it is redundant to specify it again. But for the
sake of API clarity, we now insist that the URL is specified in the
WorkGroup. At least it is easier to read at a quick glance.
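To make the readability argument concrete, here is a minimal sketch using a hypothetical stand-in class. WorkGroupSketch is not Transistor's real WorkGroup. Every field name other than url is an assumption, and the example URL is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class WorkGroupSketch:
    """Hypothetical stand-in, not Transistor's real WorkGroup class."""
    name: str
    url: str  # the URL is an explicit, required field
    workers: int = 1
    kwargs: Dict[str, Any] = field(default_factory=dict)


# Older style, as described above: the URL is tucked inside a kwarg.
old_style = {
    "name": "example",
    "workers": 2,
    "kwargs": {"url": "https://example.com/"},
}

# Newer style: the URL is visible at a glance.
group = WorkGroupSketch(name="example", url="https://example.com/", workers=2)
```

Because url has no default in this sketch, omitting it raises a TypeError at construction time, which mirrors the idea of insisting that the URL is always specified.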
v0.2.0
Many API breaking changes.
See README at https://github.com/bomquote/transistor/blob/master/CHANGES
v0.1.1
- standardized SplashScraper attributes: auth, baseurl, browser, cookies,
  crawlera_user, http_session_timeout, http_session_valid, LUA_SOURCE,
  max_retries, name, number, referrer, searchurl, splash_args, user_agent.
- now, nearly all of the SplashScraper attributes can be set via **kwargs
  if desired
- when initializing a StatefulBook instance, use a **kwarg called keywords
  to set the name of the spreadsheet column heading which contains the
  target search terms. For example: keywords='titles' or
  keywords='part_numbers'. Defaults to "item".
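The keywords behavior can be sketched with the standard library alone. The helper below is hypothetical, not part of Transistor, and it reads CSV text rather than a real spreadsheet. It only shows the idea of selecting the search-term column by its heading, with "item" as the default.

```python
import csv
import io


def read_search_terms(csv_text, keywords="item"):
    """Return the values in the column whose heading matches keywords."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if keywords not in (reader.fieldnames or []):
        raise KeyError(f"no column headed {keywords!r}")
    # Skip blank cells so empty rows do not become search terms.
    return [row[keywords] for row in reader if row[keywords]]


# Made-up sample data with a 'part_numbers' heading.
sheet = "part_numbers,qty\nABC-123,10\nXYZ-789,5\n"
terms = read_search_terms(sheet, keywords="part_numbers")
```

Raising an error when the heading is missing is a design choice in this sketch. How StatefulBook itself handles a missing column is not stated in these notes.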
First tagged/PyPI'd version
v0.1.0
Merge remote-tracking branch 'origin/master'