Commit

improved the crawler and exported all the categories json files
seekshreyas committed Dec 5, 2013
1 parent 2f3f1a6 commit 82c7dc1
Showing 21 changed files with 1,786 additions and 6 deletions.
10 changes: 6 additions & 4 deletions crawler.py
@@ -67,13 +67,15 @@ def get_ids_from_html(html_file_name):
appFileObj.close()


- appListSoup = BeautifulSoup(appFile, 'html5')
- appListElem = appListSoup.findAll('div', {"class":"card"})
+ appListSoup = BeautifulSoup(appFile, 'html5lib')
+ appListElem = appListSoup.find_all('div', class_="card")

print len(appListElem)

appId = [elem['data-docid'] for elem in appListElem]


# pprint(appId)
print "total apps : ", len(appId)
baseurl = "https://play.google.com//store/apps/details?id="
inputfilename = html_file_name.split('.')

@@ -102,7 +104,7 @@ def main():

userinput = getAppCategory()

- pprint(userinput)
+ # pprint(userinput)

get_ids_from_html(userinput['file'])
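
The ID-extraction step that `get_ids_from_html` performs above can be sketched in isolation. This is a minimal illustration, not the repository's code: it swaps BeautifulSoup and `html5lib` for the standard library's `html.parser` so that it runs without third-party packages, and it is written for Python 3 although the crawler in this diff uses Python 2 `print` statements. The names `CardIdParser` and `detail_urls`, and the sample package IDs, are hypothetical. Unlike `elem['data-docid']` in the diff, which raises `KeyError` when the attribute is missing, this sketch skips such elements.

```python
from html.parser import HTMLParser

# Base URL copied from the diff as-is, including the double slash.
BASE_URL = "https://play.google.com//store/apps/details?id="


class CardIdParser(HTMLParser):
    """Collect data-docid values from <div> tags whose class list contains "card"."""

    def __init__(self):
        super().__init__()
        self.app_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr_map = dict(attrs)
        # A valueless attribute is reported as None, so fall back to "".
        classes = (attr_map.get("class") or "").split()
        doc_id = attr_map.get("data-docid")
        if "card" in classes and doc_id:
            self.app_ids.append(doc_id)


def detail_urls(html_text):
    """Return one store detail URL per card element found in html_text."""
    parser = CardIdParser()
    parser.feed(html_text)
    parser.close()
    return [BASE_URL + app_id for app_id in parser.app_ids]


# Hypothetical sample markup; the real input is a saved category page.
sample = """
<div class="card no-rationale" data-docid="com.example.one"></div>
<div class="other" data-docid="com.example.skip"></div>
<div class="card" data-docid="com.example.two"></div>
"""

for url in detail_urls(sample):
    print(url)
```

Matching on membership in the class list mirrors how `find_all('div', class_="card")` treats multi-valued `class` attributes, so a `div` with `class="card no-rationale"` is still picked up.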

2 changes: 1 addition & 1 deletion exports/free_biz_all.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion exports/free_biz_reviews.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_comics_all.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_comics_reviews.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_communications_all.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_communications_reviews.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_lifestyle_all.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_lifestyle_reviews.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_social_all.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions exports/free_social_reviews.json

Large diffs are not rendered by default.

60 changes: 60 additions & 0 deletions inputs/free_biz_list.txt
@@ -0,0 +1,60 @@
https://play.google.com//store/apps/details?id=com.facebook.pages.app
https://play.google.com//store/apps/details?id=com.indeed.android.jobsearch
https://play.google.com//store/apps/details?id=com.squareup
https://play.google.com//store/apps/details?id=com.rhmsoft.fm
https://play.google.com//store/apps/details?id=com.quickoffice.android
https://play.google.com//store/apps/details?id=cn.wps.moffice_eng
https://play.google.com//store/apps/details?id=com.netqin.ps
https://play.google.com//store/apps/details?id=com.dataviz.docstogo
https://play.google.com//store/apps/details?id=com.box.android
https://play.google.com//store/apps/details?id=com.adpmobile.android
https://play.google.com//store/apps/details?id=com.ups.mobile.android
https://play.google.com//store/apps/details?id=com.microsoft.rdc.android
https://play.google.com//store/apps/details?id=com.mobisystems.office
https://play.google.com//store/apps/details?id=com.dynamixsoftware.printershare
https://play.google.com//store/apps/details?id=com.usps
https://play.google.com//store/apps/details?id=com.paypal.here
https://play.google.com//store/apps/details?id=com.olivephone.edit
https://play.google.com//store/apps/details?id=com.monster.android.Views
https://play.google.com//store/apps/details?id=com.citrixonline.android.gotomeeting
https://play.google.com//store/apps/details?id=com.estrongs.android.taskmanager
https://play.google.com//store/apps/details?id=com.citrix.Receiver
https://play.google.com//store/apps/details?id=com.fedex.ida.android
https://play.google.com//store/apps/details?id=com.facetime_plus.trendy
https://play.google.com//store/apps/details?id=com.splashtop.remote.pad.v2
https://play.google.com//store/apps/details?id=com.netqin.mm
https://play.google.com//store/apps/details?id=com.concur.breeze
https://play.google.com//store/apps/details?id=com.intsig.BCRLite
https://play.google.com//store/apps/details?id=com.nitrodesk.droid20.nitroid
https://play.google.com//store/apps/details?id=com.godaddy.mobile.android
https://play.google.com//store/apps/details?id=com.glassdoor.app
https://play.google.com//store/apps/details?id=com.cisco.webex.meetings
https://play.google.com//store/apps/details?id=com.authy.authy
https://play.google.com//store/apps/details?id=com.snagajob.jobseeker
https://play.google.com//store/apps/details?id=com.intuit.intuitgopayment
https://play.google.com//store/apps/details?id=com.thehomedepot.proapp
https://play.google.com//store/apps/details?id=com.stoik.mdscanlite
https://play.google.com//store/apps/details?id=com.microsoft.office.lync15
https://play.google.com//store/apps/details?id=com.enlightment.appslocker
https://play.google.com//store/apps/details?id=com.domobile.aut.bspace
https://play.google.com//store/apps/details?id=com.airwatch.androidagent
https://play.google.com//store/apps/details?id=kr.dinosoft.android.ExperienceG3Note_gl
https://play.google.com//store/apps/details?id=com.microsoft.office.lync
https://play.google.com//store/apps/details?id=com.rhythm.hexise.task
https://play.google.com//store/apps/details?id=com.wyse.pocketcloudfree
https://play.google.com//store/apps/details?id=com.mogulsoftware.android.BackPageCruiser
https://play.google.com//store/apps/details?id=com.IQBS.android.app2sd
https://play.google.com//store/apps/details?id=co.securifox.android
https://play.google.com//store/apps/details?id=mobi.infolife.installer
https://play.google.com//store/apps/details?id=com.intuit.quickbooks
https://play.google.com//store/apps/details?id=de.joergjahnke.documentviewer.android.free
https://play.google.com//store/apps/details?id=com.nitrodesk.honey.nitroid
https://play.google.com//store/apps/details?id=com.rhmsoft.fm.hd
https://play.google.com//store/apps/details?id=com.infraware.polarisoffice.entbiz.gd
https://play.google.com//store/apps/details?id=com.j2.efax
https://play.google.com//store/apps/details?id=com.mobileiron
https://play.google.com//store/apps/details?id=cn.wps.moffice_i18n
https://play.google.com//store/apps/details?id=com.mobisystems.editor.office_with_reg
https://play.google.com//store/apps/details?id=com.tux.client
https://play.google.com//store/apps/details?id=air.com.adobe.connectpro
https://play.google.com//store/apps/details?id=com.hi.applock
294 changes: 294 additions & 0 deletions inputs/free_comics.html

Large diffs are not rendered by default.

60 changes: 60 additions & 0 deletions inputs/free_comics_list.txt
@@ -0,0 +1,60 @@
https://play.google.com//store/apps/details?id=com.iconology.comics
https://play.google.com//store/apps/details?id=com.supo.pocket.mangareader
https://play.google.com//store/apps/details?id=com.marvel.comics
https://play.google.com//store/apps/details?id=cn.rtfsc.searchmanga
https://play.google.com//store/apps/details?id=com.mangareader.edrem
https://play.google.com//store/apps/details?id=ru.jecklandin.stickman
https://play.google.com//store/apps/details?id=com.rookiestudio.perfectviewer
https://play.google.com//store/apps/details?id=com.vn.animewallpaper
https://play.google.com//store/apps/details?id=com.wfox.game.jigsaw.puzzle.princess
https://play.google.com//store/apps/details?id=com.dccomics.comics
https://play.google.com//store/apps/details?id=tango.video.calling.guide
https://play.google.com//store/apps/details?id=com.skollabs.funny
https://play.google.com//store/apps/details?id=com.srsdev.allfacts
https://play.google.com//store/apps/details?id=cat.bcnmultimedia.paraboles
https://play.google.com//store/apps/details?id=com.walking_dead1_thuong.iphotome
https://play.google.com//store/apps/details?id=net.androidcomics.acv
https://play.google.com//store/apps/details?id=com.monotype.android.font.strong.conmicpack
https://play.google.com//store/apps/details?id=com.crunchyroll.crmanga
https://play.google.com//store/apps/details?id=com.loudcrow.marvelavengers
https://play.google.com//store/apps/details?id=com.hd.live.wallpaper.beauty.anime
https://play.google.com//store/apps/details?id=sextube.sexygirls.sexpositions.sexposition
https://play.google.com//store/apps/details?id=com.cyo.comicrack.viewer.free
https://play.google.com//store/apps/details?id=com.funny.haha.jokes.yomomma
https://play.google.com//store/apps/details?id=com.tmarki.comicmaker
https://play.google.com//store/apps/details?id=com.komik.free
https://play.google.com//store/apps/details?id=com.skollabs.tattoomen
https://play.google.com//store/apps/details?id=com.garciahierro.ragecomics
https://play.google.com//store/apps/details?id=com.pompeiicity.funpic
https://play.google.com//store/apps/details?id=cmobile.com.kidsmovie.free
https://play.google.com//store/apps/details?id=com.androidity.wallpaper.funny2
https://play.google.com//store/apps/details?id=com.funpokes.pokemonlove
https://play.google.com//store/apps/details?id=com.darkhorse.digital
https://play.google.com//store/apps/details?id=com.kauf.weapons.gunsanddestruction
https://play.google.com//store/apps/details?id=com.peppa.pig
https://play.google.com//store/apps/details?id=com.gau.go.launcherex.theme.cupcakes.theoya
https://play.google.com//store/apps/details?id=com.marvel.unlimited
https://play.google.com//store/apps/details?id=com.ht.manga.panda
https://play.google.com//store/apps/details?id=com.nhn.android.webtoon
https://play.google.com//store/apps/details?id=com.walking_dead2_thuong.iphotome
https://play.google.com//store/apps/details?id=com.amuniversal.android.gocomics
https://play.google.com//store/apps/details?id=com.floern.xkcd
https://play.google.com//store/apps/details?id=com.ymi.littleponywp
https://play.google.com//store/apps/details?id=com.notabasement.mangarock.android.mckinley
https://play.google.com//store/apps/details?id=com.Lock.Wallpaper.zKittyLock
https://play.google.com//store/apps/details?id=com.walking_dead3_thuong.iphotome
https://play.google.com//store/apps/details?id=com.kogatiy.gothemes.hellokitty
https://play.google.com//store/apps/details?id=com.rocky.pinkkittyclockwidget
https://play.google.com//store/apps/details?id=com.walking_dead4_thuong.iphotome
https://play.google.com//store/apps/details?id=vaha.android.vreaderfree
https://play.google.com//store/apps/details?id=com.kauf.imagefaker.photofunfunnypicscreator
https://play.google.com//store/apps/details?id=com.marv.marvelherowallpapers
https://play.google.com//store/apps/details?id=com.walking_dead7_thuong.iphotome
https://play.google.com//store/apps/details?id=com.walking_dead5_thuong.iphotome
https://play.google.com//store/apps/details?id=com.vizmanga.android
https://play.google.com//store/apps/details?id=com.walking_dead6_thuong.iphotome
https://play.google.com//store/apps/details?id=com.monotype.android.font.strong.cartoon
https://play.google.com//store/apps/details?id=com.appspot.swisscodemonkeys.lovequotes
https://play.google.com//store/apps/details?id=best.bitstrip
https://play.google.com//store/apps/details?id=com.mangazoo.ninecols
https://play.google.com//store/apps/details?id=com.c8jii.dragonball
294 changes: 294 additions & 0 deletions inputs/free_communications.html

Large diffs are not rendered by default.

60 changes: 60 additions & 0 deletions inputs/free_communications_list.txt
@@ -0,0 +1,60 @@
https://play.google.com//store/apps/details?id=com.facebook.orca
https://play.google.com//store/apps/details?id=com.skype.raider
https://play.google.com//store/apps/details?id=kik.android
https://play.google.com//store/apps/details?id=com.whatsapp
https://play.google.com//store/apps/details?id=com.yahoo.mobile.client.android.mail
https://play.google.com//store/apps/details?id=com.jb.gosms
https://play.google.com//store/apps/details?id=com.antivirus
https://play.google.com//store/apps/details?id=com.viber.voip
https://play.google.com//store/apps/details?id=com.android.chrome
https://play.google.com//store/apps/details?id=jp.naver.line.android
https://play.google.com//store/apps/details?id=org.mozilla.firefox
https://play.google.com//store/apps/details?id=com.google.android.apps.googlevoice
https://play.google.com//store/apps/details?id=com.pinger.ppa
https://play.google.com//store/apps/details?id=com.rebelvox.voxer
https://play.google.com//store/apps/details?id=com.textmeinc.textme
https://play.google.com//store/apps/details?id=mobi.mgeek.TunnyBrowser
https://play.google.com//store/apps/details?id=kr.core.technology.wifi.hotspot
https://play.google.com//store/apps/details?id=com.tencent.mm
https://play.google.com//store/apps/details?id=com.yahoo.mobile.client.android.im
https://play.google.com//store/apps/details?id=com.handcent.nextsms
https://play.google.com//store/apps/details?id=com.kakao.talk
https://play.google.com//store/apps/details?id=com.glidetalk.glideapp
https://play.google.com//store/apps/details?id=com.outlook.Z7
https://play.google.com//store/apps/details?id=com.bbm
https://play.google.com//store/apps/details?id=com.talkatone.android
https://play.google.com//store/apps/details?id=com.handcent.plugin.emoji
https://play.google.com//store/apps/details?id=com.mrnumber.blocker
https://play.google.com//store/apps/details?id=com.google.android.gm
https://play.google.com//store/apps/details?id=com.foxfi
https://play.google.com//store/apps/details?id=com.opera.mini.android
https://play.google.com//store/apps/details?id=com.appsverse.photon
https://play.google.com//store/apps/details?id=com.webascender.callerid
https://play.google.com//store/apps/details?id=com.google.android.talk
https://play.google.com//store/apps/details?id=com.sec.spp.push
https://play.google.com//store/apps/details?id=com.p1.chompsms
https://play.google.com//store/apps/details?id=net.comcast.ottclient
https://play.google.com//store/apps/details?id=com.mediafriends.chime
https://play.google.com//store/apps/details?id=com.svtechpartners.wifihotspotdemo
https://play.google.com//store/apps/details?id=com.wifi.hotspot
https://play.google.com//store/apps/details?id=com.jiubang.browser
https://play.google.com//store/apps/details?id=com.opera.browser
https://play.google.com//store/apps/details?id=com.adn37.omegleclientlite
https://play.google.com//store/apps/details?id=com.quoord.tapatalkxda.activity
https://play.google.com//store/apps/details?id=com.connectivityapps.hotmail
https://play.google.com//store/apps/details?id=com.sec.chaton
https://play.google.com//store/apps/details?id=com.foxfi2
https://play.google.com//store/apps/details?id=mobi.androidcloud.app.ptt.client
https://play.google.com//store/apps/details?id=com.p1.chompsms.emoji
https://play.google.com//store/apps/details?id=com.youmail.android.vvm
https://play.google.com//store/apps/details?id=com.yahoo.mobile.client.android.imvideo
https://play.google.com//store/apps/details?id=com.moplus.gvphone
https://play.google.com//store/apps/details?id=it.medieval.blueftp
https://play.google.com//store/apps/details?id=com.vonage.TimeToCall
https://play.google.com//store/apps/details?id=com.verizon.messaging.vzmsgs
https://play.google.com//store/apps/details?id=com.jb.gosms.theme.getjar.christmaseve
https://play.google.com//store/apps/details?id=com.moplus.moplusapp
https://play.google.com//store/apps/details?id=com.contapps.android
https://play.google.com//store/apps/details?id=com.zlango.zms
https://play.google.com//store/apps/details?id=net.idt.um.android.bossrevapp
https://play.google.com//store/apps/details?id=com.yuilop
294 changes: 294 additions & 0 deletions inputs/free_lifestyle.html

Large diffs are not rendered by default.

60 changes: 60 additions & 0 deletions inputs/free_lifestyle_list.txt
@@ -0,0 +1,60 @@
https://play.google.com//store/apps/details?id=com.target.socsav
https://play.google.com//store/apps/details?id=com.zillow.android.zillowmap
https://play.google.com//store/apps/details?id=com.dominospizza
https://play.google.com//store/apps/details?id=com.starbucks.mobilecard
https://play.google.com//store/apps/details?id=info.androidz.horoscope
https://play.google.com//store/apps/details?id=com.appsorama.bday
https://play.google.com//store/apps/details?id=com.cocacola.droid.pushplay
https://play.google.com//store/apps/details?id=com.zwigglers.android.horoscopes
https://play.google.com//store/apps/details?id=com.tinder
https://play.google.com//store/apps/details?id=com.benaughty
https://play.google.com//store/apps/details?id=com.wrapp.android
https://play.google.com//store/apps/details?id=com.urbandroid.sleep
https://play.google.com//store/apps/details?id=com.yum.pizzahut
https://play.google.com//store/apps/details?id=com.move.realtor
https://play.google.com//store/apps/details?id=com.papajohns.android
https://play.google.com//store/apps/details?id=com.sykora.neonalarm.free
https://play.google.com//store/apps/details?id=com.life360.android.safetymapd
https://play.google.com//store/apps/details?id=com.michaels.michaelsstores
https://play.google.com//store/apps/details?id=com.trulia.android
https://play.google.com//store/apps/details?id=smsr.com.cw
https://play.google.com//store/apps/details?id=com.skcc.corfire.dd
https://play.google.com//store/apps/details?id=com.listia.Listia
https://play.google.com//store/apps/details?id=com.kohls.mcommerce.opal
https://play.google.com//store/apps/details?id=com.kicksonfire.android
https://play.google.com//store/apps/details?id=com.goitab.goAdfun
https://play.google.com//store/apps/details?id=com.kbb.mobile
https://play.google.com//store/apps/details?id=mmapps.mirror.free
https://play.google.com//store/apps/details?id=fr.telemaque.horoscope
https://play.google.com//store/apps/details?id=com.fifthfinger.clients.joann
https://play.google.com//store/apps/details?id=com.hobbylobbystores.android
https://play.google.com//store/apps/details?id=com.houzz.app
https://play.google.com//store/apps/details?id=com.cg.android.birthdaycountdown
https://play.google.com//store/apps/details?id=com.trulia.android.rentals
https://play.google.com//store/apps/details?id=com.vinted
https://play.google.com//store/apps/details?id=com.vp.alarmClockPlusDock
https://play.google.com//store/apps/details?id=com.rockyou.horoscopes
https://play.google.com//store/apps/details?id=com.victoriassecret.pinknation
https://play.google.com//store/apps/details?id=com.allrecipes.spinner.free
https://play.google.com//store/apps/details?id=com.autotrader.android
https://play.google.com//store/apps/details?id=com.cars.android
https://play.google.com//store/apps/details?id=com.sei.android
https://play.google.com//store/apps/details?id=com.greatclips.android
https://play.google.com//store/apps/details?id=com.safeway.client.android.safeway
https://play.google.com//store/apps/details?id=com.jcp
https://play.google.com//store/apps/details?id=com.wallpapers.backgroud.hd
https://play.google.com//store/apps/details?id=ch.bitspin.timely
https://play.google.com//store/apps/details?id=com.ae.ae
https://play.google.com//store/apps/details?id=com.scripps.android.foodnetwork
https://play.google.com//store/apps/details?id=com.tweakersoft.aroundme
https://play.google.com//store/apps/details?id=com.digitaloutcrop.mixology
https://play.google.com//store/apps/details?id=lmontt.cl
https://play.google.com//store/apps/details?id=usol.org.vn.sexposition.activity
https://play.google.com//store/apps/details?id=com.chris.android.mydaysfree
https://play.google.com//store/apps/details?id=com.bigoven.android
https://play.google.com//store/apps/details?id=com.carmax.carmax
https://play.google.com//store/apps/details?id=com.aaa.android.discounts
https://play.google.com//store/apps/details?id=com.alarm.alarmmobile.android
https://play.google.com//store/apps/details?id=com.mobilaurus.wingstopandroid
https://play.google.com//store/apps/details?id=com.hm
https://play.google.com//store/apps/details?id=com.msart.emojikeyboard
294 changes: 294 additions & 0 deletions inputs/free_personalization.html

Large diffs are not rendered by default.

294 changes: 294 additions & 0 deletions inputs/free_social.html

Large diffs are not rendered by default.
