Management and analysis of data related to Glassdoor platform job postings
Management of structured and unstructured data
Design and implementation of a database of job ads posted on the Glassdoor platform. The development steps are listed below:
- Requirements analysis: understanding of the domain, definition of a preliminary sector schema and of the operations that the system can carry out;
- Conceptual modeling: application of conceptual modeling techniques and definition of an object-oriented conceptual schema;
- Logical modeling: application of logical modeling techniques and definition of a logical E-R schema;
- Physical modeling: transformation of the logical schema into a physical schema using the DDL and DML statements needed to define the database;
- Operations: implementation of user operations for managing and analyzing data on the platform's ads, their geographical position, and the reviews associated with them, using QL.
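As a rough illustration of how DDL, DML, and queries differ in the physical modeling and operations steps, the sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name, column names, and rows are hypothetical and are not taken from the project's scripts.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# DDL: defines the structure (hypothetical table and columns).
conn.execute(
    """
    CREATE TABLE job_listing (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        city TEXT NOT NULL
    )
    """
)

# DML: populates the structure (made-up rows for illustration).
conn.executemany(
    "INSERT INTO job_listing (title, city) VALUES (?, ?)",
    [
        ("Data Engineer", "Milan"),
        ("Data Analyst", "Milan"),
        ("Data Engineer", "Rome"),
    ],
)

# Query: a user operation that counts listings per city.
rows = conn.execute(
    "SELECT city, COUNT(*) FROM job_listing GROUP BY city ORDER BY city"
).fetchall()
conn.close()

print(rows)
```

The actual project targets PostgreSQL with PostGIS, so types, constraints, and geographic columns would differ from this sketch.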
Software: draw.io [Conceptual / Logical Design], Datagrip [SQL IDE], Pycharm [Python IDE], GATE [NLP]
DBMS: PostgreSQL (PostGIS - geographic extension)
NLP pipeline: Corpus PMI extraction, NER, Corpus Augmented TF-IDF / KyotoDomainRelevance extraction
Data source: 'https://www.kaggle.com/andresionek/data-jobs-listings-glassdoor'
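To give an idea of what the PMI extraction step of the NLP pipeline measures, here is a minimal sketch of pointwise mutual information over adjacent word pairs, using the textbook definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))). It is not the GATE implementation used in the project; the function name bigram_pmi and the tiny corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_pmi(documents):
    """Return {(w1, w2): PMI} for adjacent word pairs in tokenized documents."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in documents:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / total_bigrams          # probability of the pair
        p_x = unigrams[w1] / total_unigrams   # probability of the first word
        p_y = unigrams[w2] / total_unigrams   # probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Made-up job titles, already lowercased and split on whitespace.
corpus = [
    "data engineer".split(),
    "data engineer".split(),
    "data analyst".split(),
    "software engineer".split(),
    "machine learning".split(),
]
scores = bigram_pmi(corpus)

# Pairs whose words rarely appear apart get the highest scores.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

In this toy corpus, "machine learning" scores higher than "data engineer" because its two words never occur outside the pair, which is the kind of signal a PMI-based term extractor relies on.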
- PostgreSQL
- Clone the repo
git clone https://github.com/ClaudioPoli/JobAds.git
- That's it!
The project creates a PostgreSQL database using SQL scripts. Run them in the following order:
- DDL
- Constraint
- DML
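The order above could be automated with a small Python wrapper around the psql command-line client, sketched below. The file names (DDL.sql, Constraint.sql, DML.sql) and the database name jobads are assumptions; adjust them to the actual paths in the repository and to your own database.

```python
import subprocess

# Hypothetical file names: the README lists the scripts only as DDL, Constraint, DML.
SCRIPTS_IN_ORDER = ("DDL.sql", "Constraint.sql", "DML.sql")


def build_psql_commands(database, scripts=SCRIPTS_IN_ORDER):
    """Return one psql argument list per script, preserving the given order."""
    return [
        # ON_ERROR_STOP=1 makes psql abort a script at its first failing statement.
        ["psql", "-d", database, "-v", "ON_ERROR_STOP=1", "-f", script]
        for script in scripts
    ]


def run_scripts(database, scripts=SCRIPTS_IN_ORDER):
    """Run each script with psql, stopping at the first script that fails."""
    for command in build_psql_commands(database, scripts):
        subprocess.run(command, check=True)


# "jobads" is a placeholder database name.
commands = build_psql_commands("jobads")
for command in commands:
    print(" ".join(command))
```

Calling run_scripts("jobads") would execute the commands for real; it assumes psql is on the PATH and that the database already exists.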
Afterwards, you can run the query scripts that extract different kinds of information. The executable queries can be found in:
- BenefitAnalysis
- NumberOfListings
- JobAnalysis
- IndustryAnalysis
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE.txt for more information.