Indexing Metadata and Discovery in 2016

by Graham Beastall | Jan 27, 2016 | Industry News, Soutron Blog, Soutron Company Updates, Soutron Product News

Now that the festive season is over, what are we up to? The weather this year has been remarkably mild: 12-14 degrees in January is unheard of, so there is no excuse to jet off to a warm climate this month!

Providing eBooks to Africa

The first thing I have to report is that we now have an in-house database of some 734,000 eBooks from over 6,000 publishers. Soutron LMS is being used as the staging server to prepare data for publishing to an online e-commerce eBooks service in Africa. This is a production-level service, built on Soutron technology, that is a critical part of selling eBooks complete with Adobe DRM.

The interesting thing about this is not the e-commerce website, although we have built up a vast amount of knowledge in preparing the site and delivering it complete with dynamic currency conversion and affiliate marketing. Rather, it is that we are using the power of Soutron LMS directly on content that we are responsible for, and on a very large dataset at that. For the first time we have a library of some significance of our own to manage in-house.

The catalogue database itself is some 37 GB in size and includes metadata with abstracts. We hold the ePUB files separately, together with the book cover images (about 22 GB worth). It is a very useful size of database to work with, and to test various functions around the catalogue, thesaurus and export.

Several automated processes on the server continually load new titles and remove titles that have been withdrawn. All of the loading and management is performed by automated scripts, as are the fulfilment profiles, so the manpower needed to operate the systems is minimal. De-duplication is a critical factor: multiple formats are provided, and our service provider only wants to work with ePUB files. Publishers push out separate records for each type of ePUB file. Checks and rules are applied to validate the quality of the metadata, especially the pricing data.
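
To make the idea concrete, here is a minimal Python sketch of that kind of load-time filtering. It is not the script we run: the `Record` fields, the rule that a "work" is identified by title plus publisher, and the rule that a price must be a positive number are all assumptions made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical fields; a real ONIX feed carries far more detail.
    isbn: str
    title: str
    publisher: str
    file_format: str       # e.g. "EPUB", "PDF"
    price: Optional[str]   # raw price string as received


def work_key(rec: Record) -> tuple:
    """Group records describing the same work (assumed: title + publisher)."""
    return (rec.title.strip().lower(), rec.publisher.strip().lower())


def valid_price(raw: Optional[str]) -> bool:
    """Accept only prices that parse as a finite, positive decimal number."""
    if raw is None:
        return False
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return False
    return value.is_finite() and value > 0


def prepare(records):
    """Keep one ePUB record per work and report why the others were rejected."""
    kept = {}
    rejected = []
    for rec in records:
        if rec.file_format.upper() != "EPUB":
            rejected.append((rec, "not ePUB"))
        elif not valid_price(rec.price):
            rejected.append((rec, "bad price"))
        elif work_key(rec) in kept:
            rejected.append((rec, "duplicate"))
        else:
            kept[work_key(rec)] = rec
    return list(kept.values()), rejected
```

Returning the rejected records along with a reason, rather than dropping them silently, makes it easier to go back to a publisher about the quality of a feed.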

Metadata & Indexing

It’s a big surprise to find how much is wrong in the metadata. Starting with ISBNs, hardback ISBNs are often included and presented as eISBNs. The big one, though, is the indexing of titles. Subject categories based on BISAC are provided in the ONIX data feed from suppliers, but these are so flat that creating a meaningful hierarchy to explore and display is impossible. It wouldn’t be so bad, but the term descriptions often bear no resemblance to the content of the eBook itself, and a large number of travel books have abstracts describing places quite different from where the author has been. This has given us reason to really use the thesaurus and global edit in a way that we rarely see when testing our usual data set. The result is a “living” thesaurus that addresses just about all topics published, and a growing respect for indexers.
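
As an illustration of what a thesaurus adds over a flat list of subject categories, the Python sketch below stores a broader term for each term and derives a browsable hierarchy from it. The terms and the `BROADER` mapping are invented for the example; this is not how Soutron stores its thesaurus internally.

```python
# Hypothetical thesaurus: each term maps to its broader term (None = top level).
BROADER = {
    "Travel": None,
    "Africa": "Travel",
    "East Africa": "Africa",
    "Kenya": "East Africa",
    "Cooking": None,
}


def path_to_root(term, broader=BROADER):
    """Return the chain of terms from the top level down to `term`."""
    chain = []
    seen = set()
    while term is not None:
        if term in seen:
            raise ValueError(f"cycle in thesaurus at {term!r}")
        if term not in broader:
            raise KeyError(f"unknown term {term!r}")
        seen.add(term)
        chain.append(term)
        term = broader[term]
    return list(reversed(chain))


def build_tree(broader=BROADER):
    """Invert the broader-term map into parent -> sorted narrower terms."""
    tree = {}
    for term, parent in broader.items():
        tree.setdefault(parent, []).append(term)
    return {parent: sorted(children) for parent, children in tree.items()}
```

With this mapping, `path_to_root("Kenya")` gives the chain Travel, Africa, East Africa, Kenya, which is the kind of path a discovery interface can display and let users explore.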

Discovery

The driver for all this back-room work is Discovery: making the content more easily discoverable and useful for those who are selling, marketing and introducing the texts to users. It is leading me to think that we need a more intelligent approach to indexing, and I am keen to hear from anyone with ideas on how we might achieve that. There are one or two companies out there using AI to do this, and it would be great to hear what you think of such approaches.

Do they work, or is the nuance of language too important to be left to algorithms?

Setting up these services requires custom bibliographies produced with the Export function (paper is important in Africa). This has been very easy to set up in Soutron and, other than the banner header, I have pretty much total control over what I output without pushing it into Word. I am surprised no one has asked for this to be more flexible, given that the banner background is hard-coded right now. I could also do with extra selectors to bring data out of the database, such as one for all the records that are published. I am pushing to build out a Dashboard to make filtering output and reporting simpler.

New records are continually coming into the database, at about 5,000 a month from the existing list of publishers, and more will be added as we bring other publishers into the mix. Here’s looking to a million titles before the end of 2016.
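
For a rough sense of the numbers, the short Python sketch below projects the catalogue size. It assumes the count stands at 734,000 today and that 11 months of 2016 remain (February to December); both are simplifications, and the function names are purely illustrative.

```python
def projected_total(current, per_month, months):
    """Catalogue size after `months` of loading at a steady rate."""
    return current + per_month * months


def required_monthly_rate(current, target, months):
    """Smallest whole number of new titles per month needed to reach `target`."""
    shortfall = max(target - current, 0)
    return -(-shortfall // months)  # ceiling division


CURRENT_TITLES = 734_000   # figure quoted earlier in this post
MONTHS_LEFT = 11           # assumption: February to December 2016

at_existing_rate = projected_total(CURRENT_TITLES, 5_000, MONTHS_LEFT)
needed_rate = required_monthly_rate(CURRENT_TITLES, 1_000_000, MONTHS_LEFT)
```

On these assumptions the existing publishers alone would take the catalogue to 789,000 titles, and reaching a million would need around 24,000 new titles a month, which is why bringing other publishers into the mix matters.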

Author
Graham Beastall – Senior Consultant and Managing Director. Graham’s background is in Accountancy, Public Administration and Organisational Theory with a deep technical understanding of databases and web technologies.