This repository has been archived by the owner on Mar 5, 2019. It is now read-only.

rails: add ui for manual deduplication workflow #153

Open
aspiers opened this issue Feb 7, 2017 · 0 comments

aspiers commented Feb 7, 2017

Description

At the 2017/2/7 meeting we agreed that we needed a UI in the Rails app which supported manual deduplication ("cleansing") of entities which were still unclean after the automatic deduplication pass. This UI would need to be friendly enough to be usable by non-technical volunteers, for example those attending a "data cleansing hackathon day" event which we also proposed in the same meeting.

Blocked by

Comments, Questions and Considerations

Essentially the workflow needs to support the following sequence of events:

  • allow selection of an entity type (person, organization, or government office)
  • provide a list of unclean entities of that type in the database, preferably sorted in descending order by probability of a match against an existing clean entity already in the database (where the probability is calculated by the automated matching heuristics)
  • allow selection of one individual unclean entity
  • allow browsing of clean entities which could be a possible match for that unclean entity (again preferably sorted in descending order by probability of a match)
  • allow marking of the unclean entity as clean (see #150, "db: express whether an entity has been 'cleaned' or not"), i.e. as a new non-duplicate entity, thereby removing it from the unclean list
  • allow marking of the unclean entity as a duplicate of a clean entity, which would cause the unclean entity to be "merged with" the clean one
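
The listing and browsing steps above could be sketched in plain Ruby roughly as follows. This is only an illustration under stated assumptions: `Entity`, `match_probability`, `candidate_matches`, `unclean_queue` and the toy token-overlap score are hypothetical stand-ins, since this issue does not specify the real models or the automated matching heuristics.

```ruby
# Hypothetical in-memory stand-in for an entity record; the real app
# would use its own models and columns.
Entity = Struct.new(:id, :type, :name, :clean)

# Toy similarity score in 0.0..1.0 (Jaccard overlap of lowercased name tokens).
# Placeholder only: the real probability would come from the automated
# matching heuristics.
def match_probability(a, b)
  ta = a.name.downcase.split
  tb = b.name.downcase.split
  union = (ta | tb).size
  return 0.0 if union.zero?
  (ta & tb).size.to_f / union
end

# Clean entities of the same type as `unclean`, paired with their score,
# best match first.
def candidate_matches(unclean, entities)
  entities
    .select { |e| e.clean && e.type == unclean.type }
    .map { |e| [e, match_probability(unclean, e)] }
    .sort_by { |_, probability| -probability }
end

# Unclean entities of one type, sorted in descending order by the
# probability of their best match against a clean entity.
def unclean_queue(type, entities)
  entities
    .select { |e| !e.clean && e.type == type }
    .sort_by { |u| -(candidate_matches(u, entities).first&.last || 0.0) }
end
```

Sorting the queue by best-match probability puts the likeliest duplicates in front of volunteers first; an unclean entity with no plausible match falls to the end of the list, where marking it as new and clean is the likely outcome.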

This final step would manifest itself in the following sequence of database changes:

  • a new record would be inserted in the entity name database table (see #152, "db: extract entity names into separate table") with name matching the unclean entity, and with the entity id foreign key equal to the id of the matched clean entity
  • any references from other tables to the unclean entity would be changed to refer to the matched clean entity
  • the unclean entity would be removed from the database.
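
The three database changes above can be sketched against an in-memory stand-in for the database. Everything here is an assumption made for illustration: `merge_duplicate!` and the layout of the `db` hash are hypothetical, and the real table and column names depend on #150 and #152.

```ruby
# Merge an unclean entity into a clean one, mirroring the three steps above.
# `db` is a hypothetical in-memory stand-in for the database:
#   db[:entities]     => { id => { name: String, clean: true/false } }
#   db[:entity_names] => [ { entity_id: Integer, name: String }, ... ]
#   db[:references]   => [ { table: String, entity_id: Integer }, ... ]
def merge_duplicate!(db, unclean_id, clean_id)
  unclean = db[:entities].fetch(unclean_id)
  clean   = db[:entities].fetch(clean_id)
  raise ArgumentError, "target entity must be clean" unless clean[:clean]
  raise ArgumentError, "source entity must be unclean" if unclean[:clean]

  # 1. Record the unclean entity's name against the matched clean entity.
  db[:entity_names] << { entity_id: clean_id, name: unclean[:name] }

  # 2. Repoint references from other tables to the matched clean entity.
  db[:references].each do |ref|
    ref[:entity_id] = clean_id if ref[:entity_id] == unclean_id
  end

  # 3. Remove the unclean entity.
  db[:entities].delete(unclean_id)
  db
end
```

In the Rails app itself, these steps would presumably be wrapped in a single database transaction (for example `ActiveRecord::Base.transaction`), so that a failure partway through cannot leave references pointing at a deleted entity.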

Acceptance Criteria

This story can be considered done when the following acceptance tests are satisfied:

Given a database populated with both clean and unclean entities of the same type
When I visit the Rails frontend
Then I can go through unclean entities one by one, either deduplicating them or marking them as new, clean entities.
