GPT-3 vs. Data-to-text: What’s the Right Content Generation Technology for Your Business?

Do you want to generate content or do you want to know how text generators work? Then you've come to the right place! In the following, we explain how data-to-text and GPT-3 tools create text automatically, how they work and in which areas they are used.

Both GPT-3 and data-to-text are NLG technologies. NLG stands for "Natural Language Generation" and refers to the automated generation of natural language text by a machine. NLG programs make it possible to produce large amounts of text very quickly. Even though both are NLG technologies, there are major differences between them.

What are GPT-3 and Data-to-Text and how do they differ?

Data-to-text: What it is and how it works

Data-to-text refers to the automated production of natural language texts based on structured data. Structured data are attributes that are available in a fixed, machine-readable form, for example as tables. Examples include product features from a PIM system or the match data of a soccer game; in other words, the data contain information that can be used in texts. Because the text is derived from this data and from rules the user defines, the user has control over the result, can intervene in the text creation at any time, and can make updates or adjustments. This control ensures text consistency, meaningfulness and quality. The texts are also customizable and scalable: tools based on structured data can, for example, create hundreds of product texts with variable details in just a few moments.

In addition, text creation is possible in multiple languages. So you can generate the same content in English AND German AND Italian AND many other languages.
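
As a rough illustration of the idea, the following minimal Python sketch fills one template per language from the same structured record. The product record, the templates, and the `render` function are invented for this example and are not taken from any particular data-to-text product; real tools handle grammar and variation far more thoroughly.

```python
# Hypothetical product record, as it might come from a PIM export.
product = {
    "name": "Trail Runner 2",
    "color": {"en": "blue", "de": "Blau", "it": "blu"},
    "weight_g": 280,
}

# One template per language; the placeholders are filled from the same data.
TEMPLATES = {
    "en": "The {name} comes in {color} and weighs {weight_g} g.",
    "de": "Der {name} ist in {color} erhältlich und wiegt {weight_g} g.",
    "it": "{name} è disponibile in {color} e pesa {weight_g} g.",
}


def render(record, lang):
    """Fill the template for `lang` with values from the structured record."""
    return TEMPLATES[lang].format(
        name=record["name"],
        color=record["color"][lang],
        weight_g=record["weight_g"],
    )


# The same data yields one text per language.
texts = {lang: render(product, lang) for lang in TEMPLATES}
for lang, text in texts.items():
    print(lang, "->", text)
```

Because every text is derived from the same record, changing a value such as `weight_g` changes all language versions consistently.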

GPT-3: What it is and how it works

The GPT in GPT-3 stands for "Generative Pre-trained Transformer." It is a language model that learns from existing text and can provide different ways to end a sentence. It has been trained with hundreds of billions of words, representing a significant portion of the internet - including the entire corpus of the English Wikipedia, countless books, and a dizzying number of webpages.
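
The idea of learning from existing text and proposing ways to continue a sentence can be sketched with a toy model. The Python example below only counts which word follows which in a tiny invented corpus. This is a deliberate simplification: GPT-3 is a large transformer neural network, not a word-pair counter, so the sketch shows the principle of probabilistic continuation rather than how GPT-3 is built.

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus; GPT-3 was trained on hundreds of billions of words.
corpus = "the sun is hot . the sun is bright . the day is hot .".split()

# Count which word follows which (a bigram table).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1


def next_word_probabilities(word):
    """Return the observed continuations of `word` with relative frequencies."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


# After "is", the toy model has seen "hot" twice and "bright" once.
probs = next_word_probabilities("is")
print(probs)
```

The model has no notion of whether a continuation is true or sensible; it only reflects how often word sequences occurred in the text it has seen.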

Unlike data-to-text, GPT-3 can only be used to generate individual texts, although it does so quickly. Moreover, the user has no control over the generated content. GPT-3 also does not yet support multilingual output to the same extent as data-to-text: each text is created in English OR German OR Italian, not in all of them at once.

A comparison of both technologies and their differentiators

| Differentiators | Data-to-Text | GPT-3 |
| --- | --- | --- |
| Classification | Machine-automated production of natural language texts | Language generation system, neural network ("language prediction model") |
| Usage | For creating large amounts of text based on structured data sets with variable details | To create a basic text; can simplify the writing process |
| Based on | Structured data that need to be provided | Deep learning |
| Scalable content production process? | Yes | No |
| Functionality | Deterministic, based on rules and logic | Probabilistic approach, trained on very large amounts of text |
| Human input | Needs setup of data and rules, then works independently | Generates texts, but these must be checked for quality |
| Text generation by | Generates texts by analyzing structured data | Generates texts according to the patterns recognized from texts |
| Languages | Multilingual, text creation possible in many different languages at the same time | Only one language at a time |
| Personalized content possible? | Yes | No |
| Use cases | News, reports, product descriptions, category texts, … (content where structured data is available) | Any type of text (if sufficient training data is available) |

Which AI content generation technology is right for which use case?

Which technology is preferable depends on the use case. GPT-3 is suitable for gaining inspiration or for producing a basic framework for a continuous text, for example a blog post. Data-to-text software, thanks to its scalability, is used in companies that need larger amounts of text.

Data-to-Text Use Cases

Data-to-text is used in industries such as banking and finance, pharmaceuticals, media and publishing, and the broad field of e-commerce.

For e-commerce companies, data-to-text is profitable because they can, for example, generate high-quality descriptions for many products with similar details, in different languages and in consistent quality. This can save time and money and can increase SEO visibility and conversion rates on product pages.
Writing large amounts of text manually, such as thousands of product descriptions for an online store, is almost impossible, especially when these texts have to be revised regularly to keep them up to date, for example because of seasonal changes.
Data-to-text software is ideal for this use case because, once the project is set up, it is sufficient to update the existing data. The text is then updated with one click, or new and unique texts are generated straight away. This frees copywriters and editors to spend more time on creative tasks and conceptual work.

For pharmaceutical and financial companies, for example, the software is interesting because texts can be generated automatically from data or statistics.

GPT-3 Use Cases

GPT-3 tools can be especially helpful for brainstorming and as a source of inspiration. How valuable a text is and how much it needs to be edited usually depends on the topic. The more specific the topic, the more imprecise and meaningless the content seems, as users report in this Reddit thread.

This is due to GPT-3's lack of awareness and general knowledge. The technology can only pull existing knowledge from data sources and reproduce it in a text. It is unable to evaluate or filter the statements. As a result, the content may not make sense or may even contain swear words and provide ethically questionable statements.

Nevertheless, using GPT-3 can be worthwhile, for example when users are looking for a basic framework for a text, or when they are struggling with writer's block and want to draw inspiration from a text created with GPT-3.

GPT-3 can also be useful when an extensive and detailed text is to be generated automatically from a small amount of input, or in situations where having humans write the text is inefficient or impractical. An example of this is a chatbot that answers recurring customer queries.

How do GPT-3 and Data-to-Text work?

Data-to-Text

1. The user feeds their structured data into the data-to-text software, e.g. in the form of Excel or CSV files. This data forms the basis for the texts.
2. The user defines rules and statements that determine the tone and style of subsequent texts. Thus, the user always retains control over the structure, grammar and word choice of the texts.
3. The software creates scalable texts based on the data and rules.
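
The three steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation of any specific data-to-text product: the CSV columns, the product names, and the functions `battery_statement`, `waterproof_statement` and `describe` are all hypothetical.

```python
import csv
import io

# Step 1: structured data. An in-memory CSV stands in for an uploaded file;
# the columns and values are invented for this illustration.
CSV_DATA = """name,category,battery_hours,waterproof
Aero Buds,earphones,8,yes
Bass Pods,earphones,24,no
"""


# Step 2: rules and statements written by the user.
def battery_statement(row):
    """Choose the wording depending on the battery value in the data."""
    hours = int(row["battery_hours"])
    if hours >= 20:
        return f"With {hours} hours of battery life, they last through long trips."
    return f"The battery lasts up to {hours} hours."


def waterproof_statement(row):
    """Only mention the feature when the data says it applies."""
    return "They are waterproof." if row["waterproof"] == "yes" else ""


def describe(row):
    """Combine the statements that apply to this row into one text."""
    parts = [
        f"The {row['name']} are {row['category']}.",
        battery_statement(row),
        waterproof_statement(row),
    ]
    return " ".join(p for p in parts if p)


# Step 3: generate one text per data row.
rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
texts = [describe(row) for row in rows]
for text in texts:
    print(text)
```

The same input always produces the same output, and changing a value in the data and running the generation again yields an updated text without rewriting anything by hand.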

GPT-3

The first sentence in the example below was supplied to a GPT-3 tool; the rest was generated by the tool. A text about possible activities on hot days was requested. The message of the generated text clearly differs from that of the original statement:

This is an example of a short text that was generated by a GPT-3 tool.

GPT-3 can both rewrite and continue a text, as shown in our example. The software analyzed the input and predicted how the text is most likely to continue. Thus, the hot summer day ends up being a day when everyone has to get up at 5 a.m. and clouds and rain begin to roll in. It quickly becomes clear that the desired sentences about possible activities on a hot day cannot be generated in this way, so the resulting text can only serve as a basis for inspiration.

Advantages and disadvantages of GPT-3 and Data-to-Text

Of course, both technologies have their strengths and weaknesses. Both generate texts automatically, but as already discussed, each is suitable for different use cases.

Data-to-text is based on structured data in a machine-readable form. Tasks such as storytelling or writing blog posts and social media posts are therefore left to humans, because these types of texts cannot be meaningfully generated using data-to-text software. Here, GPT-3 is a suitable alternative as a basis for text creation.

While data-to-text is anchored in the user's reality through the data that is fed in, GPT-3 is a neural network solution that generates language from texts and has no direct reference to the real world. Consequently, additional editing is necessary to guarantee good quality and, not least, to ensure that the texts are meaningful.

Final words on text quality and scalability

Can GPT-3 write texts? The answer is "yes", but you have to be aware of its weaknesses. The words are generated without errors and the grammar rules are applied correctly, but far too often the meaning of the statements is lost, as our text example has shown. This means that GPT-3 can generate texts, but the result may be of such inferior quality that hardly any time is saved, because thorough revision of the text is unavoidable.

In addition, it is impossible for the user to feed the GPT-3 program with more knowledge (for example in the form of data) in order to improve it. In many cases, however, this is a crucial requirement for use. In such cases, data-to-text is suitable because it is fed with company-specific data and generates text based on this information. However, this approach is highly dependent on data quality and takes more time up front, since a project must be set up first.
