Changes for next data load
Schema
- Add author_index field so author and author2 don't need to be searched simultaneously
- Add tokenized title field
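
A minimal sketch of how the combined author field could be filled in at load time. The dict-based document and the helper name build_author_index are assumptions made for illustration, not the loader's actual code; only the field names author, author2 and author_index come from the list above.

```python
def build_author_index(doc):
    """Return a copy of doc with author and author2 merged into author_index.

    Hypothetical helper: doc is assumed to be a plain dict in which each
    field holds either a single string or a list of strings.
    """
    values = []
    for field in ("author", "author2"):
        value = doc.get(field)
        if value is None:
            continue
        # Accept either a single string or a list of strings.
        items = [value] if isinstance(value, str) else list(value)
        for item in items:
            item = item.strip()
            # Skip blanks and names already collected from the other field.
            if item and item not in values:
                values.append(item)
    merged = dict(doc)
    merged["author_index"] = values
    return merged
```

With this in place a single query against author_index covers both the main and the added author entries.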
Document processing
- Remove duplicate subject headings
- Do not load electronic journals with no 856 (identified by "Electronic journals" in 655, Ebsco in 710, [electronic resource] in GMD)
- Do not load records with invalid OCLC in 001
- Do not load items with blank cat date
- Fix spacing problems between 245|a and |b
- Fix invalid Unicode problems
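
The filtering and cleanup rules above could be sketched roughly as follows. Everything about the record layout is an assumption: a flat dict with hypothetical keys ("001", "655", "710", "856", "gmd", "cat_date", "subjects", "245a", "245b") stands in for whatever the real loader reads. Two further assumptions are labeled in the code: that any one of the three markers is enough to treat a record as an electronic journal, and that a "valid" OCLC number is an optional ocm/ocn/on prefix followed by digits.

```python
import re

# Assumed notion of a valid OCLC number: optional ocm/ocn/on prefix, then digits.
OCLC_PATTERN = re.compile(r"^(ocm|ocn|on)?\d+$")


def is_electronic_journal(rec):
    """Assumption: any one of the three markers identifies an e-journal."""
    return (
        "Electronic journals" in rec.get("655", [])
        or any("ebsco" in name.lower() for name in rec.get("710", []))
        or "[electronic resource]" in rec.get("gmd", "")
    )


def should_load(rec):
    """Return False for records the list above says to skip."""
    if is_electronic_journal(rec) and not rec.get("856"):
        return False  # electronic journal with no 856
    if not OCLC_PATTERN.match(rec.get("001", "").strip()):
        return False  # invalid OCLC number in 001
    if not rec.get("cat_date", "").strip():
        return False  # blank cat date
    return True


def clean(rec):
    """Return a copy with duplicate subjects removed and the title rebuilt."""
    out = dict(rec)
    # dict.fromkeys keeps the first occurrence of each heading, in order.
    out["subjects"] = list(dict.fromkeys(rec.get("subjects", [])))
    # Join 245|a and |b with exactly one space between them.
    a = rec.get("245a", "").strip()
    b = rec.get("245b", "").strip()
    out["title"] = " ".join(part for part in (a, b) if part)
    return out
```

A loader would call should_load first and pass only the surviving records through clean.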
Changes to document ranking
- Boost documents with many holdings
- Boost title keywords
- Boost 651|a
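
To make the three boosts concrete, here is a toy scoring function. It is not how the search engine actually ranks; the function name, the multiplier values and the logarithmic holdings curve are all hypothetical, chosen only to show how the boosts might combine. Field 651 is the MARC geographic-name subject heading.

```python
import math


def boosted_score(base_score, holdings_count, title_hit, geo_subject_hit,
                  title_boost=2.0, geo_boost=1.5):
    """Combine the three boosts into one score (all weights hypothetical)."""
    # Holdings boost: grows with the number of holdings, but with
    # diminishing returns so heavily held items do not swamp everything.
    score = base_score * (1.0 + math.log1p(max(holdings_count, 0)))
    if title_hit:
        score *= title_boost  # keyword matched in the title
    if geo_subject_hit:
        score *= geo_boost    # keyword matched in 651|a
    return score
```

The logarithm is one possible design choice: a record with 100 holdings ranks above one with 10, but not ten times higher.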