GPT-3 vs. Data-to-text: What’s the Right Content Generation Technology for Your Business?

This article gives you an overview & introduction to e-commerce automation and the areas it affects.

Do you want to generate content or do you want to know how text generators work? Then you've come to the right place! In the following, we explain how data-to-text and GPT-3 tools create text automatically, how they work and in which areas they are used.

Both GPT-3 and data-to-text are NLG technologies. NLG stands for "Natural Language Generation" and refers to the automated generation of natural language text by a machine. NLG programs make it possible to produce large amounts of text very quickly. Even though both GPT-3 and data-to-text are NLG technologies, there are major differences between them.

In this article we provide the following information:

What are GPT-3 and Data-to-Text and how do they differ?
Which AI content generation technology is right for which use case?
- Data-to-Text Use Cases
- GPT-3 Use Cases
How do GPT-3 and Data-to-Text work?
- Data-to-Text
- GPT-3
Advantages and Disadvantages of GPT-3 and Data-to-Text
Final words on text quality and scalability

What are GPT-3 and Data-to-Text and how do they differ?

Data-to-text: What it is and how it works

Data-to-text refers to the automated writing of natural language texts based on structured data. Structured data are attributes that are available in an organized form, for example as tables. Examples include product features from a PIM system or the match data of a soccer game - in other words, information that can be used in texts. Because the texts are built from this data and from rules the user defines, the user has control over the text result, can intervene in the text creation at any time, and can make updates or adjustments. This control ensures consistency, meaningfulness and quality. The texts are also customizable and scalable: tools based on structured data can, for example, create hundreds of product texts with variable details in just a few moments.

In addition, text creation is possible in multiple languages. So you can generate the same content in English AND German AND Italian AND many other languages.
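To illustrate the idea of producing the same content in several languages from one data record, here is a minimal Python sketch. The templates, the color lookup table and the function name render_all_languages are hypothetical and made up for this example; they are not taken from any particular data-to-text product. A real tool would also have to handle grammar such as gender and case, which this sketch ignores.

```python
# Hypothetical sentence templates, one per target language.
TEMPLATES = {
    "en": "The {name} weighs {weight_g} g and is available in {color}.",
    "de": "Der {name} wiegt {weight_g} g und ist in {color} erhältlich.",
    "it": "Il {name} pesa {weight_g} g ed è disponibile in {color}.",
}

# Hypothetical lookup table for attribute values that need translating.
COLORS = {
    "red": {"en": "red", "de": "Rot", "it": "rosso"},
    "blue": {"en": "blue", "de": "Blau", "it": "blu"},
}


def render_all_languages(record):
    """Return one text per language for a single structured record."""
    texts = {}
    for lang, template in TEMPLATES.items():
        texts[lang] = template.format(
            name=record["name"],
            weight_g=record["weight_g"],
            color=COLORS[record["color"]][lang],
        )
    return texts


# One record (for example a row from a PIM export) yields three texts.
record = {"name": "Trail Runner 2", "weight_g": 240, "color": "red"}
for lang, text in render_all_languages(record).items():
    print(lang, text)
```

The point of the sketch is that the data is entered once and every language version is derived from it, so a change to the record shows up in all languages at the same time.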

GPT-3: What it is and how it works

The GPT in GPT-3 stands for "Generative Pre-trained Transformer." It is a language model that learns from existing text and can provide different ways to end a sentence. It has been trained with hundreds of billions of words, representing a significant portion of the internet - including the entire corpus of the English Wikipedia, countless books, and a dizzying number of webpages.

Unlike data-to-text, GPT-3 can only be used to generate individual texts, although this can be done quickly. In addition, the user has no control over the generated content.

GPT-3 does not yet support multilingualism to the same extent as data-to-text: each text is created in one language at a time, that is, in English OR German OR Italian.

A comparison of both technologies and their differentiators

Differentiators	Data-to-Text	GPT-3
Classification	Machine-automated production of natural language texts	Language generation system based on neural networks ("language prediction model")
Usage	For creating large amounts of text based on structured data sets with variable details	To create a basic text, can simplify writing process
Based On	Based on structured data that need to be provided	Based on Deep Learning
Scalable content production process?	Yes	No
Functionality	Deterministic, based on rules and logic	Probabilistic approach, trained on very large amounts of text
Human Input	Needs initial setup of data and rules, then works independently	Generates texts, but these must be checked for quality
Text generation by	Generates texts by analyzing structured data	Generates texts according to the patterns recognized from texts
Languages	Multilingual, text creation possible in many different languages at the same time	Only one language at a time
Personalized content possible?	Yes	No
Use Cases	News, reports, product descriptions, category texts, … (content where structured data is available)	Any type of text (if sufficient training data is available)

Which AI content generation technology is right for which use case?

Which technology is preferable depends on the use case. GPT-3 is suitable for gaining inspiration or a basic framework for a continuous text, for example a blog post. Data-to-text software, due to its scalability, is used in companies that need larger amounts of text.

Data-to-Text Use Cases

Data-to-text is used in industries such as banking and finance, pharmaceuticals, media and publishing, and the broad field of e-commerce.

For e-commerce companies, data-to-text is profitable because they can, for example, generate high-quality product descriptions for many products with similar details - in different languages and in consistent quality. This can save time and money, as well as increase SEO visibility and conversion rates on product pages.
Writing large amounts of text manually, such as thousands of product descriptions for an online store, is almost impossible, especially when these texts have to be revised regularly to keep them up to date, for example due to seasonal changes.
For this use case, data-to-text software is a great help: once the project is set up, it is sufficient to update the existing data. The text is then updated immediately with one click, or new and unique texts are generated straight away. This relief allows copywriters and editors to spend more time on creative tasks and conceptual work.

For pharmaceutical and financial companies, for example, the software is interesting because texts can be generated automatically from data or statistics.

GPT-3 Use Cases

GPT-3 tools can be especially helpful for brainstorming and as a source of inspiration. How valuable a text is and how much it needs to be edited usually depends on the topic. The more specific the topic, the more imprecise and meaningless the content seems, as users report in this Reddit thread.

This is due to GPT-3's lack of awareness and general knowledge. The technology can only pull existing knowledge from data sources and reproduce it in a text. It is unable to evaluate or filter the statements. As a result, the content may not make sense or may even contain swear words and provide ethically questionable statements.

Nevertheless, the use of GPT-3 can be quite worthwhile, for example if users are looking for a basic framework for their text, or if they are struggling with writer's block and can gain inspiration from a text created with GPT-3.

GPT-3 can also be useful if an extensive and detailed text is to be generated automatically on the basis of a small amount of input, or in situations where having humans write the text is inefficient or impractical. An example of this is the use of a chatbot to answer recurring customer queries.

How do GPT-3 and Data-to-Text work?

Data-to-Text

1. The user feeds structured data into the data-to-text software, e.g. in the form of Excel or CSV files. This data forms the basis for the texts.

2. The user defines rules and statements that determine the tone and style of the subsequent texts, and thus always retains control over the structure, grammar and word choice of the texts.

3. The software creates scalable texts based on the data and rules.
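
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV columns, the thresholds and the function names (load_records, battery_phrase, price_phrase, describe) are invented for this example and do not reflect how any specific data-to-text software is implemented.

```python
import csv
import io

# Step 1: structured data. An inline CSV stands in for an uploaded file;
# the product names and columns are hypothetical.
CSV_DATA = """name,category,battery_hours,price_eur
AirBuds Mini,headphones,6,49
AirBuds Pro,headphones,30,199
"""


def load_records(csv_text):
    """Parse CSV text into a list of dictionaries, one per product."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Step 2: rules written by the user. Each rule maps a data value to wording.
def battery_phrase(hours):
    if hours >= 20:
        return f"lasts up to {hours} hours on a single charge, enough for long trips"
    return f"offers {hours} hours of battery life for everyday use"


def price_phrase(price):
    return "a premium choice" if price >= 150 else "a budget-friendly choice"


# Step 3: the text is assembled from the data and the rules.
def describe(record):
    hours = int(record["battery_hours"])
    price = int(record["price_eur"])
    return (
        f"The {record['name']} {battery_phrase(hours)}. "
        f"At {price} EUR, it is {price_phrase(price)}."
    )


for record in load_records(CSV_DATA):
    print(describe(record))
```

Because the rules are deterministic, the same data always produces the same text, and changing a value in the CSV changes only the wording that depends on it.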

This short video explains in 90 seconds how automated content generation with data-to-text technology works.

Here you can see a concrete example:

GPT-3

The first sentence in the example below was supplied to a GPT-3 tool. The rest was generated by the software. A text about possible activities on hot days was requested. The message of the generated text clearly differs from the original statement:

GPT-3 can both rewrite and continue a text, as shown in our example. The software analyzed the input and predicted how the text is most likely to continue. Thus, the hot summer day ends up being a day when everyone has to get up at 5 a.m. and clouds and rain begin to roll in. It quickly becomes clear that the desired sentences about possible activities on a hot day cannot be generated in this way. The resulting text can therefore only serve as a basis for inspiration.
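
To make the idea of predicting the most likely continuation more tangible, here is a toy Python sketch. It is not GPT-3 and not a neural network: it is a simple word-pair (bigram) counter used as a stand-in, trained on a tiny made-up corpus. The names CORPUS, train_bigrams and continue_text are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up training text. GPT-3 was trained on hundreds of billions
# of words; this toy corpus only shows the principle.
CORPUS = (
    "it was a hot summer day . "
    "it was a hot summer night . "
    "it was a cold winter day . "
    "the summer day ended with rain ."
)


def train_bigrams(text):
    """Count, for every word, which words followed it in the training text."""
    followers = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def continue_text(followers, prompt, max_words=8, seed=None):
    """Extend the prompt word by word, sampling each next word in
    proportion to how often it followed the previous word in training."""
    rng = random.Random(seed)
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break  # the last word never appeared in training: nothing to predict
        candidates = list(options)
        weights = list(options.values())
        words.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(words)


model = train_bigrams(CORPUS)
print(continue_text(model, "it was a hot"))
```

The sketch only knows which words tend to follow each other, not what the user wants to say. Running it several times without a fixed seed can give different continuations, some of which drift from a hot day to rain or winter, which mirrors the behavior described in the example above.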

Advantages and disadvantages of GPT-3 and Data-to-Text

Of course, both technologies have their strengths and weaknesses. Both generate texts automatically, but as already discussed, each is suitable for different use cases.

Data-to-text is based on structured data in a machine-readable form. Therefore, tasks like storytelling, as well as writing blog posts or social media posts, are left to humans, because these types of texts cannot be meaningfully generated using data-to-text software. Here, GPT-3 is a suitable alternative as a basis for text creation.

While data-to-text is anchored in the user's reality through the data that is fed in, GPT-3 is a neural network solution that generates language from texts and has no direct reference to the real world. Consequently, additional editing is necessary to guarantee good quality and, not least, to ensure the meaningfulness of the texts.


Final words on text quality and scalability

Can GPT-3 write texts? The answer is "yes", but you have to be aware of its weaknesses!

It is true that the words are generated without errors and the grammar rules are applied correctly. But far too often the meaning of the statements is lost, as our text examples have shown. This means that a GPT-3 text generator can write texts, but the result may be of such poor quality that hardly any time is saved, because thorough revision is absolutely necessary. In addition, the user cannot feed the GPT-3 generator with more knowledge (for example in the form of data) in order to improve the result, although in many cases this is a crucial requirement. In such cases, data-to-text is suitable because it is fed with company-specific data and generates text based on this information. On the other hand, data-to-text is highly dependent on data quality and takes more time at the start, since an initial project must be set up first.

FAQ

What is Natural Language Generation (NLG)?

Natural Language Generation (NLG) refers to the automated generation of natural language by a machine. As a subfield of computational linguistics, content generation is a special form of artificial intelligence. Natural language generation is used in many sectors and for many purposes, such as e-commerce, financial services, and the pharmaceutical sector. It is considered most effective for automating repetitive and time-intensive writing tasks like product descriptions, reports or personalized content.

What is content automation?

Automated content generation with AX Semantics works with the help of Natural Language Generation (NLG) - a technology that generates, on the basis of structured data, high-quality and unique content that is indistinguishable from manually written content. Text automation is used for generating product descriptions, category content, financial and sports reports, or content for search engines and websites. In a nutshell, it is used for all kinds of content that are needed in large quantities and have a similar basic structure.

Who can benefit from Natural Language Generation?

Natural Language Generation (NLG) software is increasingly used worldwide by companies and individuals across a variety of fields. The generated content can be aimed at any target group. In particular, online marketing experts with large online stores use NLG software to automatically generate product descriptions or content for category pages.
