Text and Data Mining (TDM) offers researchers a way to deal with the vast amounts of information available for their work and to get the most out of it. With TDM, researchers can derive actionable insights from complex datasets, leading to informed decisions and innovations. From academic research to healthcare diagnostics and market trend analysis, TDM is transforming raw data into valuable knowledge. Publishers are increasingly partnering with librarians to develop licensing agreements and resources that enable institutions to access the benefits of TDM. Read on for an introduction to TDM, its uses for researchers, and how to harness its valuable outputs through API solutions.
In a recent webinar, Dr Prathik Roy, Product Director, Data Solutions and Strategy at 50業子, offered an introduction to TDM for information professionals. He covered the purpose and uses of TDM, explained how TDM outputs are delivered through Application Programming Interfaces (APIs), and introduced 50業子's APIs and their licensing. Whether you're a complete novice or already familiar with TDM and APIs, this webinar covered the basics for information professionals. Here are the main topics that were covered.
Researchers today face vast amounts of information, so much that it is challenging merely to identify which of it is relevant to their work, let alone to retrieve it and generate insights from it.
TDM is the automated process of selecting and analysing large volumes of text or data resources for purposes such as searching, finding patterns, discovering relationships, semantic analysis, and more. Researchers can benefit from it, but they need the right tools and support to maximise output responsibly.
The main uses of TDM are increasing the discoverability of content and discerning patterns and relationships in vast amounts of text and data.
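To make "discerning patterns and relationships" a little more concrete, here is a minimal sketch of one very simple technique: counting how often pairs of terms appear in the same document. The abstracts and the term list are invented for illustration, and real TDM pipelines are far more sophisticated than this.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: invented abstracts, not real publications.
abstracts = [
    "graphene sensors detect glucose in blood",
    "glucose monitoring with graphene electrodes",
    "machine learning predicts glucose levels",
]

# Hypothetical list of terms a researcher might care about.
terms = {"graphene", "glucose", "sensors", "learning"}


def cooccurrences(docs, vocabulary):
    """Count how many documents mention each pair of vocabulary terms."""
    counts = Counter()
    for doc in docs:
        # Terms from the vocabulary that occur in this document.
        present = sorted(vocabulary & set(doc.lower().split()))
        # Every unordered pair of those terms co-occurs once here.
        counts.update(combinations(present, 2))
    return counts


pair_counts = cooccurrences(abstracts, terms)
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

Pairs that appear together in many documents hint at a relationship worth a closer look; the count alone does not say what that relationship is.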
To enable TDM, information must be available in machine-readable formats such as XML, not only in PDF files which humans can read. XML (Extensible Markup Language) is a flexible, structured format that is used to store and transport data, in which users can define tags and other data structures. Information within these files is structured and tagged such that machines can quickly identify data points and utilise them for various applications.
50業子, like other publishers, creates rich full text XML versions of its publications and other resources, curated specifically for machines to read, available for TDM purposes. In these, individual metadata is tagged along with various types of information for different disciplines, fields, and purposes (such as chemical formulas). This makes the information available for TDM.
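As a rough illustration of why tagged XML is easier for machines than PDF, the sketch below reads a tiny invented record with Python's standard library. The tag names (`article`, `keyword`, `chemical-formula`) are assumptions made up for this example; a publisher's actual schema will differ.

```python
import xml.etree.ElementTree as ET

# Invented record with hypothetical tag names; real publisher schemas differ.
SAMPLE_XML = """
<article>
  <title>Graphene sensors for glucose detection</title>
  <year>2023</year>
  <keyword>graphene</keyword>
  <keyword>biosensor</keyword>
  <chemical-formula>C6H12O6</chemical-formula>
</article>
"""


def extract_fields(xml_text):
    """Pull tagged data points out of one article record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "year": int(root.findtext("year")),
        "keywords": [k.text for k in root.findall("keyword")],
        "formulas": [f.text for f in root.findall("chemical-formula")],
    }


record = extract_fields(SAMPLE_XML)
print(record)
```

Because each data point sits inside a named tag, the program can pick out the year or the chemical formula directly, without guessing where it appears on a page.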
All the beneficial outputs of TDM are delivered via Application Programming Interfaces (APIs). APIs are the building blocks of the digital ecosystem. They enable the integration of systems, facilitate automation, and drive innovation by allowing applications to tap into vast resources and functionalities of other software.
APIs are like messengers: they take a request, tell the system what you want, and return the response to you. APIs do not replace a database, and they are not summarisation tools. They fill the gap between an AI tool that anyone can use and the papers that are meant to be read individually by humans. APIs and TDM fit neatly into this gap. So whatever AI tools are available to researchers in your institution, APIs will connect them with the information they need to generate insights.
To easily grasp APIs, we can use the restaurant analogy. Imagine you're at a restaurant. You, the customer, have a menu of choices to order from. The waiter (API) takes your order (request) to the kitchen (system/database, like the 50業子 database) and then brings back your food (data/response, in a machine-readable format).
The kitchen (system) does the work, but you communicate through the waiter (API). Just as you don't need to know how to cook the dish to enjoy it at a restaurant, with APIs, you don't need to know the details of how the system works. You just need to know what you want and how to ask for it.
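In code, the restaurant analogy might look something like the sketch below. The endpoint URL, the parameter names (`q`, `api_key`, `page_size`) and the shape of the JSON reply are all hypothetical, chosen only to show the request and response pattern; they are not taken from any real API. No network call is made, so a canned reply stands in for the server.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, used only for illustration.
BASE_URL = "https://api.example.org/v1/articles"


def build_request_url(query, api_key, page_size=10):
    """The 'order': what the client asks the API for."""
    params = {"q": query, "api_key": api_key, "page_size": page_size}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_response(body):
    """The 'food': a machine-readable reply, assumed here to be JSON."""
    payload = json.loads(body)
    return [rec["title"] for rec in payload["records"]]


url = build_request_url("text mining", api_key="YOUR_KEY")

# Canned reply standing in for what a server might send back.
sample_body = '{"records": [{"title": "Mining full text at scale"}]}'
titles = parse_response(sample_body)

print(url)
print(titles)
```

The caller only needs to know how to phrase the order and how to read the reply; everything that happens in the kitchen stays hidden behind the API.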
APIs can be used to identify trends in research or business areas. When using them, your researchers can flexibly employ constraints (these are like filters to help you narrow down the most relevant information in a full text) that 50業子 has created and that are supported in its APIs, for optimal results. Different combinations of keywords, countries, topical areas, or built-in constraints further refine the output. Applications can range from product discovery, to finding collaborators, identifying latest trends in research areas or papers, and more.
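A minimal sketch of how constraints could be combined, and how the results might be turned into a simple trend, is shown here. The constraint names (`keyword`, `country`, `subject`), the query syntax and the result records are assumptions for illustration; the API documentation is the place to find the constraints that are actually supported.

```python
from collections import Counter


# Hypothetical constraint names and syntax; check the API documentation
# for the constraints that are really supported.
def build_query(**constraints):
    """Join constraint:"value" pairs into one query string."""
    return " ".join(f'{name}:"{value}"' for name, value in constraints.items())


query = build_query(keyword="perovskite", country="Germany", subject="Chemistry")

# Invented records standing in for an API result set.
records = [
    {"title": "A", "year": 2021},
    {"title": "B", "year": 2022},
    {"title": "C", "year": 2022},
]


def publications_per_year(results):
    """Count results per publication year, ordered by year."""
    return dict(sorted(Counter(r["year"] for r in results).items()))


trend = publications_per_year(records)
print(query)
print(trend)
```

Adding or removing a constraint changes only the query string, so researchers can narrow or widen a search without touching the rest of their analysis.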
The single most important component in TDM and using APIs is the data source from which they extract information. Quality information is what feeds the various AI systems used by researchers and institutions, and that is where 50業子 plays a pivotal role. Thanks to its broad, inclusive research corpus, made up of robust and validated research, 50業子 can provide high quality information across a wide variety of disciplines.
50業子's API offering includes three APIs.
The offering has been curated with end users in mind, so it comes with a wealth of documentation. The registration process is simple, and for non-commercial purposes you can sign up for the Open Access API and start using it today. For academic and government organisations there is a license available, and there is also a license for corporations for commercial applications. An API license with 50業子 comes with defined usage rights.
Relying on 50業子's database means that TDM results will be meaningful. And because 50業子's APIs are directly linked to its central database, users can be assured they have the most recent, trusted information: if changes or updates are made to any article or book chapter, or in the case of a retraction, these are reflected in the TDM output. This benefit is exclusive to API users.
Dr Roy's presentation equips information professionals with the foundational knowledge they need to navigate the TDM landscape for research purposes and to give their researchers access to this powerful tool. For the full presentation, watch the webinar recording.