If you are wondering where the data of this site comes from, please visit https://api.github.com/users/jacobthill/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.

jacobthill/blacklight 0

Blacklight provides a discovery interface for any Solr (http://lucene.apache.org/solr) index.

jacobthill/blacklight-gallery 0

Gallery views for Blacklight

jacobthill/blacklight-oembed 0

OEmbed media views for Blacklight

jacobthill/blacklight_range_limit 0

Range facet/limit/profile plugin for Blacklight

jacobthill/dlme 0

Digital Library of the Middle East Prototype

jacobthill/dlme-helper 0

Helper scripts for DLME data analysis and transformation testing

jacobthill/intro-to-topic-modeling-TED 0

An introduction to topic modeling using a corpus of TED talks

jacobthill/metadataQA 0

Metadata Quality Analysis Scripts

PR closed sul-dlss/dlme-transform

Bod title

Why was this change made?

Some Bodleian records were missing titles and title extraction macros were missing test coverage

How was this change tested?

Local transform

Which documentation and/or configurations were updated?

n/a

+157 -68

0 comment

12 changed files

jacobthill

pr closed time in 26 minutes

PR opened sul-dlss/dlme-transform

fix Bodleian title extraction, add macro and tests

Why was this change made?

Some Bodleian records didn't have a title in the raw record

How was this change tested?

Local transform

Which documentation and/or configurations were updated?

n/a

+74 -9

0 comment

3 changed files

pr created time in 26 minutes

create branch sul-dlss/dlme-transform

branch : bodleian-titles

created branch time in 27 minutes

PR opened sul-dlss/dlme-transform

Bod title

Why was this change made?

Some Bodleian records were missing titles and title extraction macros were missing test coverage

How was this change tested?

Local transform

Which documentation and/or configurations were updated?

n/a

+157 -68

0 comment

12 changed files

pr created time in an hour

create branch sul-dlss/dlme-transform

branch : bod-title

created branch time in an hour

push event sul-dlss/dlme-transform

Jacob Hill

commit sha bdb9d9d18efb99d093e4d3cee25ca99a2664d080

revise title extraction bodleian arabic


push time in 3 hours

push event sul-dlss/dlme-transform

Jacob Hill

commit sha e64572b05a4da6b3801e7061edb32d32f3847f3f

add missing bodleian lang keys


push time in 3 hours

push event sul-dlss/dlme-metadata

Jacob Hill

commit sha ff376f865f71eb0fe9af6b2ce75a31869112c8e5

refresh bnf records


jacobthill

commit sha 56d092fba38b0ee925e38fa2979b0672a4cc338b

Merge pull request #221 from sul-dlss/bnf refresh bnf records


push time in 5 hours

PR merged sul-dlss/dlme-metadata

refresh bnf records
+26260 -13945

0 comment

964 changed files

jacobthill

pr closed time in 5 hours

PR opened sul-dlss/dlme-metadata

refresh bnf records
+26260 -13945

0 comment

964 changed files

pr created time in 5 hours

create branch sul-dlss/dlme-metadata

branch : bnf

created branch time in 5 hours

push event sul-dlss/dlme-harvest

Jacob Hill

commit sha 65d1183d8720a87902af3e4be4277f09aa03e67a

revise old bnf harvest script, add tests and logging


jacobthill

commit sha 86bde7012c8fd104da7d872e7ddff551c2890308

Merge pull request #91 from sul-dlss/bnf revise old bnf harvest script, add tests and logging


push time in 5 hours

create branch sul-dlss/dlme-harvest

branch : bnf

created branch time in 5 hours

PR opened sul-dlss/dlme-transform

fix language not found in auc

Why was this change made?

several auc configs were returning NOT FOUND errors for language values

How was this change tested?

local transform

Which documentation and/or configurations were updated?

n/a

+7 -6

0 comment

4 changed files

pr created time in 16 hours

create branch sul-dlss/dlme-transform

branch : not-found

created branch time in 16 hours

push event sul-dlss/dlme-transform

Jacob Hill

commit sha b41c484b9660ea2ccb0c816032a8dd81438e53f3

add missing lang keys to bodleian configs


push time in 16 hours

PR opened sul-dlss/dlme-transform

add missing lang keys

Why was this change made?

Adds missing lang keys

How was this change tested?

local transform

Which documentation and/or configurations were updated?

n/a

+11 -11

0 comment

3 changed files

pr created time in a day

create branch sul-dlss/dlme-transform

branch : lang

created branch time in a day

PR opened sul-dlss/dlme-transform

update shahre farang mapping

Why was this change made?

Fixes some errors in the config.

How was this change tested?

Local transform.

Which documentation and/or configurations were updated?

n/a

+12 -9

0 comment

3 changed files

pr created time in a day

create branch sul-dlss/dlme-transform

branch : shahre

created branch time in a day

PR opened sul-dlss/dlme-transform

add missing language keys

Why was this change made?

A new test to catch missing language keys was written. This PR adds keys to fields that are missing them so that PR can be merged.

How was this change tested?

local transform

Which documentation and/or configurations were updated?

n/a

+6 -4

0 comment

3 changed files

pr created time in 2 days

create branch sul-dlss/dlme-transform

branch : lang-hash

created branch time in 2 days

issue comment sul-dlss/dlme-transform

Bad thumbnails in QNL

Yeah, I saw the error in the config, but there is an error in the raw data files as well. We can fix the config to ignore the iiif manifest in those cases and grab the url from the raw data but it won't actually fix these urls until QNL updates them and we re-harvest.

I'm not sure we want to display our image placeholder if the thumbnail is not valid. I think we want to suppress the record once we have the auto harvest worked out. In that case, if the url breaks, we will not reload the record, and if the provider fixes the error, the record will get loaded on the next auto harvest. Does that make sense?

jacobthill

comment created time in 7 days

issue comment sul-dlss/dlme-transform

Bad urls should not break application

This is captured in the google doc, but maybe first we should check to see if they are valid. I'm pretty sure the application doesn't break when a url doesn't resolve as long as it's valid. It seems to only break when you pass something that doesn't look like a valid url. The other issue, of course, is that when we re-harvest a set we should test urls and suppress those that no longer resolve so we keep the site clean of broken urls.
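The re-harvest step described in this comment could look roughly like the sketch below. It is an illustration in Python rather than code from dlme-transform: the function names `url_resolves` and `partition_records`, the flat dict record shape, and the choice of a HEAD request are all assumptions. The `agg_preview` field name comes from the issue.

```python
import http.client
import urllib.error
import urllib.request


def url_resolves(url, timeout=10):
    """Return True if a HEAD request to the url gets a non-error response."""
    try:
        # Request() itself raises ValueError for strings without a scheme.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, ValueError, OSError):
        return False


def partition_records(records, field="agg_preview", check=url_resolves):
    """Split records into (keep, suppress) by whether record[field] resolves.

    Records are assumed to be flat dicts (hypothetical shape). A record
    with no value in the field is suppressed as well.
    """
    keep, suppress = [], []
    for record in records:
        url = record.get(field)
        if url and check(url):
            keep.append(record)
        else:
            suppress.append(record)
    return keep, suppress
```

Passing the check in as a parameter keeps the partitioning logic testable without network access. Some servers reject HEAD requests, so a real harvest might need to fall back to GET before suppressing a record.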

jacobthill

comment created time in 7 days

issue opened sul-dlss/dlme-transform

Bad urls should not break application

Currently, if a bad url is loaded in the agg_preview field (and probably agg_is_shown_at as well), the following error message is displayed when clicking on the set of records containing the bad url (e.g. by selecting the data contributor): [screenshot: Screen Shot 2020-08-21 at 9 20 28 AM]. Selecting the exhibit dashboard to unload the records results in the same message.

Loading bad urls is part of the transformation process. I have error checking locally, but it only tests whether the url is valid, not whether it resolves. I also forget to run that check sometimes. In some cases, particularly when the collection is large, it is difficult to find the record with the bad url.

Desired behavior:

  • [ ] The application will not break completely when a bad url is loaded; it should still be possible to access the exhibit dashboard to unload the records.
  • [ ] The error message will indicate that the url will not resolve and return the dlme_source_file value (if available) and the id instead of "We're sorry, but something went wrong."
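A minimal sketch of the kind of check and error message described above, written in Python for illustration. It covers only the syntactic "is this a valid url" test, not whether the url resolves. The helper names `is_valid_url` and `url_problems` and the flat dict record shape are assumptions. The field names `agg_preview`, `agg_is_shown_at`, `dlme_source_file` and `id` are taken from this issue.

```python
from urllib.parse import urlparse


def is_valid_url(url):
    """Syntactic check only: an http(s) scheme and a non-empty host."""
    if not isinstance(url, str):
        return False
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse raises on some malformed inputs, e.g. a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def url_problems(record, fields=("agg_preview", "agg_is_shown_at")):
    """Return one readable message per malformed url in a record.

    The record is assumed to be a flat dict (hypothetical shape).
    """
    problems = []
    source = record.get("dlme_source_file", "unknown source file")
    for field in fields:
        url = record.get(field)
        if url is not None and not is_valid_url(url):
            problems.append(
                f"{field} is not a valid url: {url!r} "
                f"(id: {record.get('id')}, source: {source})"
            )
    return problems
```

Reporting the id and source file alongside the bad value is what makes the offending record findable in a large collection, instead of a generic "something went wrong" page.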

created time in 7 days

issue opened sul-dlss/dlme-transform

Bad thumbnails in QNL

A few QNL records have bad thumbnails leading to pages that will not render in the application:

  • https://app.honeybadger.io/projects/53082/faults/80088751/01FB4G4F3PB8QZT061807CR2C0#notice-summary

  • For example, we can draw this page: https://dlmenetwork.org/ar/library/catalog?f%5Bdate%5D%5B%5D=1400&page=12565

  • but not this one: https://dlmenetwork.org/ar/library/catalog?f%5Bdate%5D%5B%5D=1400&page=12561

Known bad ids are:

  • 81055/vdc_100023618815.0x000007 from record qnl/british-library-combined/data/group7/qnl-combined-6815.xml
  • 81055/vdc_100023618815.0x000005 from record qnl/british-library-combined/data/group17/qnl-combined-16488.xml

Need to:

  • [ ] reharvest QNL
  • [ ] see if the data is the same and revise mapping config, if needed

created time in 7 days

PR opened sul-dlss/dlme-transform

update agg_is_shown_at urls

Why was this change made?

The resource urls changed since the last harvest

How was this change tested?

Local transform and load in dev

Which documentation and/or configurations were updated?

n/a

+1 -1

0 comment

1 changed file

pr created time in 8 days

create branch sul-dlss/dlme-transform

branch : princeton

created branch time in 8 days