In 1983, American professor of statistics and computer science Edward Tufte introduced the concept of data-ink ratio in his famous book, “The Visual Display of Quantitative Information.” Data-ink ratio is a pioneering theory in data visualization that has been highly influential in recommended practices and teachings, and its notion of excellence reflects a minimalistic style of design. The influence of the Minimalism art movement is noticeable in Tufte’s theory, even though he did not explicitly mention it. However, academic research shows that data-ink ratio has generated mixed results, pointing to the need for more complex frameworks in data visualization.
What is data-ink ratio?
Tufte’s data-ink ratio is built on his proposition that “data graphics should draw the viewer’s attention to the sense and substance of the data, not to something else.” The conceptual formula is:
Data-ink ratio = data-ink / total ink used to print the graphic
“Data ink” means what is absolutely necessary to show the data. The word “ink” is used because the theory was formulated in the age of print graphics dominance. The equivalent of “ink” in the digital world today is “pixels.” Data-ink ratio aims to find and maximize the share of informative elements out of the total elements used in a chart, with a ratio of 1 as the ultimate goal. A visualization with such a ratio contains only elements that show the data, with no decorations or redundancies.
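As a rough illustration of the ratio described above, the short Python sketch below treats “ink” as pixel counts. The function name and the pixel figures are hypothetical and chosen only to show the arithmetic; they do not come from Tufte’s book.

```python
def data_ink_ratio(data_pixels: int, total_pixels: int) -> float:
    """Share of a chart's drawn pixels that show the data itself."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    if not 0 <= data_pixels <= total_pixels:
        raise ValueError("data_pixels must be between 0 and total_pixels")
    return data_pixels / total_pixels


# Hypothetical chart: 1,200 pixels draw the bars, 4,800 pixels are drawn in
# total (bars plus gridlines, borders, and background decoration).
before = data_ink_ratio(1200, 4800)

# After erasing 3,000 pixels of non-data elements, the same bars remain.
after = data_ink_ratio(1200, 1800)

print(f"before: {before:.2f}, after: {after:.2f}")  # before: 0.25, after: 0.67
```

The ratio rises only because non-data pixels were removed; the data pixels stay the same, which is the effect the erasing principles aim for.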
Tufte asserted this goal by articulating two erasing principles: “erase non-data-ink, within reason” and “erase redundant data-ink, within reason.” For these two types of ink, he coined the term “chartjunk.” The excessive elements to get rid of include decorations, background images, unnecessary colors, gridlines, axes, and tick marks. He said that the principles should be applied through editing and redesign. This led to the structuring of the process of data visualization as a cyclical one that ends when the most minimal output is reached.
The cycle goes as follows, in Tufte’s words:
Illustration by Islam Salahuddin
Minimalism in data visualization
Minimalism became one of the most important and influential art movements in the 1960s, during Tufte’s youth. Minimalism is defined by the Cambridge Dictionary as “a style in art, design, and theater that uses the smallest range of materials and colors possible, and only very simple shapes or forms.” It is a kind of reductive abstract art that is characterized by “plain geometric configurations” that lack “any decorative… flourishes.”
That can be juxtaposed with what Tufte instructed under the data-ink ratio theory: to achieve a data visualization that is simple and, therefore, minimal. Tufte’s theory suggests that such simplicity can deliver a message that is clear, precise, and efficient. As a result, it could reduce the time required for the user to perceive the visualization.
One of the many posts circulating on the internet that exemplify applying Tufte’s reductive principles to a bar chart. | Image Source: Darkhorse Analytics
Tufte’s priorities match the needs of the fast-paced business world. In business, saving time means saving cost and therefore maximizing profit. It was in the nineties, less than a decade after Tufte’s theory, when the accumulation of information started to skyrocket with the advancement of the World Wide Web. Businesspeople seemed to be lacking even more time with more information to consider, and a minimalistic approach seemed like a solution.
Minimalism did not only affect data visualization; its effect also reached almost every corner of the human-computer interaction (HCI) field. For example, as the internet became more widespread, search engines started to develop and became so powerful that they began to impose their own rules of the game. Because of how search engines work, web pages with a minimalistic, structured design were easier for the engines to read and rank, and therefore easier for the audience to find.
For similar benefits, like reach in web pages and efficiency in data visualizations, minimalistic design became pervasive in all computer interfaces over the years. Today, minimalistic design is usually described as “modern design” and recommended to every designer building a user-interaction system, from mobile apps to web pages to data visualizations.
On the left: Donald Judd’s minimalist installation, Untitled, 1969 (photo from: Guggenheim Museum, New York) | On the right: Tufte’s minimalist variation of a double-bar chart with error lines. | Collage by Islam Salahuddin
Reviews of data-ink ratio’s minimalism
Despite presenting his hypotheses as proven facts, Tufte never empirically tested the benefits he claimed for minimalistic design. It was not until the beginning of the nineties that academic research started to put the claims under the microscope.
The initial research findings struggled to prove the hypotheses. However, a major multi-experiment research paper in 1994 found that some non-data-ink and redundant data-ink, especially backgrounds and tick marks on the y-axis, may decrease accuracy and increase response time, both of which indicate worse performance. Meanwhile, other ink that Tufte considered chartjunk and wanted removed whenever possible, like axis lines, was shown to improve performance in some cases. The negative effect was clear in some chartjunk types, like axis lines, but was less certain in others, like three-dimensional charts.
Such experiments were still built on Tufte’s proposition that graphical excellence means clarity, precision, and efficiency, but they found that the relationship between data-ink ratio and excellence in that sense can hardly be as linear as Tufte suggests. The paper states that “effects of ink are highly conditional on the features of the graph and task” and therefore “simple rules like Tufte’s will not suffice.” Instead of indicating that all non-data-ink and redundant data-ink should be erased, the authors call on data visualization designers to determine whether the use of any ink will facilitate or interfere with reading a graph, depending on its context.
Later research even questioned Tufte’s components of graphical excellence, especially the presumed importance of response time in all cases. An empirical paper in 2007 found that users may prefer non-minimalistic visualizations over Tufte’s minimalistic ones, partially because they may find the latter boring. This is a criticism that both Minimalist art and statistics face and a perception that Tufte tried to avert with his rule. Boredom should not be treated as a minor problem because it means less ability to attract attention. A visualization’s ability to attract attention is the gateway to the viewer’s perception in the first place.
Attention is one of the criteria that Tufte’s rule overlooks. Other significant factors are memorability and engagement. More advanced experiments in 2013 and 2015 reaffirmed that chartjunk is not always harmful. In some cases, it may even increase the memorability and engagement of a visualization. Attributes like color and human-recognizable shapes, icons, and images can enhance memorability due to their ability to activate more parts of a viewer’s brain, leveraging its natural tendency towards what is familiar rather than what is just minimal.
Despite their popularity, chartjunk and similar terms also appear to be highly open to interpretation among practitioners. Interpretation can be affected by an individual’s circumstances that include culture, personal style, preferences, and views, as well as constraints of skills, tools, and user’s priorities, according to a discourse analysis that was published in 2022.
On the left: Frank Stella’s minimalist painting, title not known, 1967 (photo from Tate Modern) | On the right: Tufte’s minimalist variation of a multiple vertical box and whisker plot. | Collage made by Islam Salahuddin
How to make sense of the previous discussions
The growing body of research shows that data visualization is a task that can hardly be guided by a single-factor rule like data-ink ratio. It shows that even the simple task of choosing what elements to include or exclude in a visualization remains largely uncharted territory and needs further examination.
However, one of the common underpinnings that all theoretical works share is a consideration for the importance of context in which a visualization is designed. To be fair, even Tufte himself did not ignore this consideration after all and emphasized that certain principles have to be adopted “within reason.” Asserting the “reasonability” factor, he deliberately mentions in the Data-Ink Maximization chapter of his book that maximizing data-ink “is but a single dimension of a complex and multivariate design task.” He recognized the possible existence of factors other than excellency that come into play, including “beauty,” even if he did not prioritize them.
Therefore, synthesizing all the critiques of Tufte’s data-ink ratio rule appears to be possible by quoting Tufte himself. He said that determining the extent to which the data-ink ratio should be maximized “rests on statistical and aesthetic criteria.” This allows data visualization designers to find the sweet spot where a visualization delivers what it intends to and, at the same time, does not alienate its audience for the sake of being minimal.
All in all, minimalism can be considered one of the means to design a great data visualization, but not a goal. After all, the goal will remain to deliver the intended message to the audience so they can perceive it best.
Want to understand how visual representations can support the decision-making process and allow quick transmission of information? Sign up for The KPI Institute’s Data Visualization Certification course.
Edward Tufte’s principles of data-ink ratio have prevailed in data visualization since they were introduced in the 1980s. His theory has imposed a tendency towards a minimalistic style, defining excellence as clarity, precision, and efficiency and reducing the time users need to perceive information.
Meanwhile, academic research that has put the pioneering American statistician’s teachings to the test does not show a linear relationship between data-ink ratio and a visualization’s excellence. Further research shed light on other important criteria that Tufte overlooked, like the ability of a visualization to attract attention, its memorability, and its engagement. Overall, the academic body of literature has strongly suggested that no simple rule like data-ink ratio can suffice in data visualization design.
Debates among practitioners have been ongoing about the repeated notion of “less is more,” which leans on Tufte’s teachings. Some believe that simplicity and quick perception should be the goals of all visualizations at all times. Others support embracing complexity and slow viewing time in some circumstances.
As a response to these debates, two interesting frameworks have emerged to suggest more criteria that should be considered. The first is “Levers of Chart-Making” by Andy Cotgreave, a senior data evangelist at Tableau, and the second is “Cognitive Load as a Guide” by Eva Sibinga and Erin Waldron, data science and visualization specialists.
Cotgreave suggested this framework, which is still being formulated, in the November 2022 edition of his newsletter The Sweet Spot. He put forward five scales of levers that “chart producers can use to enlighten, not bamboozle.” They are as follows:
Speed to primary insight – How fast or slow insight is intended to be extracted from a graph. According to him, “it is ok to make charts that take time to understand.”
Granularity – How sparse or granular is the data that a chart intends to show?
Explore or explain – Whether a visualization is intended to give users the opportunity to explore the data themselves (like self-service dashboards) or to be accompanied by an explanatory presentation.
Dry or emotional – How serious the way of presenting the data is versus how informal and relatable it is to non-data people. According to Cotgreave, an example of the serious approach is a normal column chart, and an example of the emotional approach is a necklace whose bead sizes represent the same underlying data.
Ambiguity vs. accuracy – For Cotgreave, there can be intended ambiguity in chart-making instead of clear accuracy.
Cognitive load is a more detailed and rigid framework that takes its inspiration from the psychology of instructional design. Suggested by Sibinga and Waldron, the framework was published in Nightingale, the journal of the Data Visualization Society, in September 2021.
Cognitive load proposes 12 spectra, offering “an alternative to one-size-fits-all rules” and aiming to “encourage a more nuanced strategy” for data visualization. Divided into three categories, the spectra are supposed to “gauge the complexity of our data on one side, identify the needs of our audience on the other, and then calibrate our visualization to successfully bridge the gap between the two.”
Intrinsic load – This is the first group of spectra that is concerned with the data itself. It considers the inherent level of complexity in the data that a designer is tasked to explain with a visualization. The included spectra are:
Measurement (quantitative vs. qualitative) – According to the authors, quantitative data has less cognitive load (easier to perceive) than qualitative data. That is because the former usually has obvious measuring units, like dollars or miles, while the latter usually needs a conceptual rating scale, like satisfaction rate from 1 to 5.
Knowability (certain vs. uncertain) – Data collected from the whole population is easier to perceive than data estimated from a sample or predicted for the future. This is because the former usually has a high level of certainty, while the latter comes with uncertainty and its inevitable statistical margins of error.
Specificity (precise vs. ambiguous) – Undebated data categories, like blood type or zip codes, tend to be easier to perceive than socially determined concepts, like gender, race, and social class.
Relatability (concrete vs. abstract) – How relatable is the data to what humans see in everyday life? Concrete data would be small numbers like the cost of lunch and one’s age, while abstract data would be conceptual ones like GDP and the age of the earth.
Germane load – The second group of spectra is concerned with the audience and how ready they are to process the new information shown by a visualization. The included spectra are:
Connection (intentional vs. coincidental) – How will the audience first encounter the visualization? Intentional viewers are likely better prepared to perceive the visualization than viewers who stumble upon it by accident.
Pace (slow vs. fast) – Slow viewers are the ones who have more time on their hands and therefore have more capacity to perceive a visualization (which translates into a lighter cognitive load).
Knowledge (expert vs. novice) – Expert viewers are the ones who are already familiar with the subject and therefore will bear a lighter cognitive load when viewing a visualization.
Confidence (confident vs. anxious) – This spectrum addresses the intersection of the audience and the data reporting format. An audience familiar with the data reporting format, such as an interactive dashboard or a data-based report, will bear a lighter cognitive load than one that is encountering such a channel for the first time.
Extraneous load – The final group addresses how new information is presented. The authors believe that these are the criteria where a designer has the most control and should therefore be considered last. Their advice is to determine a visualization’s place on the following spectra by answering the question: “Given the existing intrinsic and germane loads, how much more cognitive load are we comfortable adding to the mix?”
Chart type (common vs. rare) – Common chart types like bar charts carry a lighter cognitive load than uncommon ones, like violin charts, rose diagrams, and more innovative types.
Interpretation (accurate vs. approximate) – Does the chart aim to deliver precise values or paint a wide picture? According to the authors, charts delivering specific values tend to take a lighter cognitive load than the ones dealing with overall objectives.
Composition (concise vs. detailed) – This spectrum assumes that a high data-ink ratio and no chartjunk (from Tufte’s concepts) are already in place and then asks, how dense is the information on the page? Less dense visualizations require a lighter cognitive load.
Delivery (explanatory vs. exploratory) – Does the data report explain itself, or is it built to be explored? Exploration, naturally, takes more cognitive load than a self-explaining visualization.
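One hypothetical way to keep the twelve spectra in view while planning a visualization is a simple checklist, sketched below in Python. The dictionary layout, the function name, and the example project are assumptions made for illustration; the framework itself does not prescribe any code or scoring, and the sketch only lists which spectra lean toward the heavier end.

```python
# Each spectrum lists its two ends as (lighter-load end, heavier-load end),
# following the descriptions of the twelve spectra above.
SPECTRA = {
    "intrinsic": {
        "measurement": ("quantitative", "qualitative"),
        "knowability": ("certain", "uncertain"),
        "specificity": ("precise", "ambiguous"),
        "relatability": ("concrete", "abstract"),
    },
    "germane": {
        "connection": ("intentional", "coincidental"),
        "pace": ("slow", "fast"),
        "knowledge": ("expert", "novice"),
        "confidence": ("confident", "anxious"),
    },
    "extraneous": {
        "chart type": ("common", "rare"),
        "interpretation": ("accurate", "approximate"),
        "composition": ("concise", "detailed"),
        "delivery": ("explanatory", "exploratory"),
    },
}


def heavy_spectra(choices: dict) -> dict:
    """Group, per category, the spectra whose chosen end is the heavier one."""
    result = {}
    for category, spectra in SPECTRA.items():
        heavy = []
        for name, (light_end, heavy_end) in spectra.items():
            chosen = choices[name]
            if chosen not in (light_end, heavy_end):
                raise ValueError(f"{name}: unknown end {chosen!r}")
            if chosen == heavy_end:
                heavy.append(name)
        result[category] = heavy
    return result


# Hypothetical project: sample-based satisfaction ratings shown to busy
# newcomers on a plain bar chart.
project = {
    "measurement": "qualitative", "knowability": "uncertain",
    "specificity": "precise", "relatability": "concrete",
    "connection": "intentional", "pace": "fast",
    "knowledge": "novice", "confidence": "confident",
    "chart type": "common", "interpretation": "accurate",
    "composition": "concise", "delivery": "explanatory",
}

print(heavy_spectra(project))
```

In this made-up case, the data and the audience already carry some load, so the designer keeps every extraneous choice at the lighter end.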
How to make sense of all the previous discussions
Levers of Chart-Making and Cognitive Load as a Guide are two recently suggested frameworks that offer a more complex approach to the task of data visualization. The two have similarities, like their consideration of complexity, granularity, and way of delivery. They differ from Tufte’s approach mainly in accepting that some designs need to be perceived slowly in some circumstances. Cognitive load, however, still explicitly presupposes that data-ink ratio principles have been applied beforehand.
Therefore, no framework is likely to totally replace the others. At best, they tend to complement each other to cover the vast territory of the data visualization domain.
Data-ink ratio principles remain a good starting point, as they fit most business contexts. They can also help designers keep the purpose of their design in mind and avoid getting distracted by all the software tools available today. However, considering the emerging frameworks can make the practice more nuanced for tackling different needs, messages, and audiences.
The final determinant of how to incorporate the three frameworks, and any other emerging ones, in practice will remain the context of the visualization. A better understanding of the audience, the message, and the medium is key before using the different frameworks to decide on how information should be delivered.
“If communication is more art than science, then it’s more sculpture than painting. While you’re adding to build your picture in painting, you’re chipping away at sculpting. And when you’re deciding on the insights to use, you’re chipping away everything you have to reveal the core key insights that will best achieve your purpose,” according to Craig Smith, McKinsey & Company’s client communication expert.
The same principle applies in the context of data visualization. Chipping away is important to not overdress data with complicated graphs, special effects, and excess colors. Data presentations with too many elements can confuse and overwhelm the audience.
Keep in mind that data must convey information. Allow data visualization elements to communicate and not to serve as a decoration. The simpler it is, the more accessible and understandable it is. “Less is more” as long as the visuals still convey the intended message.
Finding the parallels between the processes of exploratory and explanatory data visualization and the practice of sculpting could help improve how data visualization is done. How can chipping away truly add more clarity to data visualization?
Exploratory Visualization: Adding Lumps of Clay
Exploratory visualization is the phase where you are trying to understand the data yourself before deciding what interesting insights it might hold in its depths. You can hunt and polish these insights in the later stage before presenting them to your audience.
In this stage, you might end up creating maybe a hundred charts. You may create some of them to get a better sense of the statistical description of the data: means, medians, maximum and minimum values, and many more.
In the exploratory stage, you can also spot interesting outliers and experiment with a few things to test relationships between different values. Out of the 100 hypotheses that you visually analyze to find your way through the data in your hands, you may end up settling on two of them to work on and present to your audience.
In the parallel world of sculpting, artists do a similar thing. They start with an armature, the equivalent of raw data in design. Then, they keep adding lumps of clay to it, the equivalent of exploratory visualizations.
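A minimal sketch of this kind of first look at the data, using only Python’s standard library, might go as follows. The sales figures are made up for illustration, and the 1.5 × IQR cutoff for flagging outliers is a common convention assumed here, not something the article prescribes.

```python
import statistics

# Hypothetical monthly sales figures, used only for illustration.
sales = [120, 135, 128, 410, 131, 126, 140, 133]

# Basic statistical description of the data.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "min": min(sales),
    "max": max(sales),
}

# Flag values far outside the middle half of the data using the
# interquartile range (IQR). The 1.5 * IQR cutoff is an assumed convention.
q1, _, q3 = statistics.quantiles(sales, n=4)
iqr = q3 - q1
outliers = [v for v in sales if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(summary)
print("possible outliers:", outliers)
```

A gap between the mean and the median, as in this made-up series, is often the first hint that an outlier deserves a chart of its own.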
Artists know for sure that a lot of this clay will end up out of the final sculpture. But they are aware that this accumulation of material is essential because it starts giving them a sense of ideal materialization. Also, adding enough material will ensure that they have plenty to work with when they begin shaping up their work.
In the exploratory stage, approaching data visualization as a form of sculpting may remind us to resist two common and fatal urges:
The urge to rush into the explanatory stage – Heading to the chipping away stage too early will lead to flawed results.
The urge to show all of what has been done in the exploratory stage to the audience, begrudging all the effort that we have put into it – When you feel that urge, remember that you don’t want to show your audience that big lump of clay; you want to show a beautified result.
Explanatory Visualization: Chipping Away the Unnecessary
Explanatory visualization is where you settle on the worth-reporting insights. You start polishing the visualizations to do what they are supposed to do, which is explaining or conveying the meaning at a glance.
The main goal of this stage is to ensure that there are no distractions in your visualization. Also, this stage makes sure that there are no unnecessary lumps of clay that hide the intended meaning or the envisioned shape.
In the explanatory stage, sculptors use various tools, but what they aim for is the same. They first refine the basic form by taking away large amounts of material to ensure they are on track. Then, they move to finer forming, using more precise tools to carve in the shape’s features and others to add texture. The main question driving this stage for sculptors is, what uncovers the envisioned shape underneath?
In data visualization, you can try taking out each element in your visualization, like titles, legends, labels, colors, and so on. Then, ask yourself the same question each time: does the visualization still convey its meaning?
If yes, keep that element out. If not, try to figure out what is missing and think of less distracting alternatives, if any. For example, do you have multiple categories that you need to name? Try using labels attached to data points instead of separate legends.
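The try-it-without routine described above can be sketched as a small loop. Everything in this Python example is hypothetical: in practice, the designer judges whether a chart still conveys its meaning, so the `still_conveys_meaning` function below is only a stand-in rule invented for illustration.

```python
def chip_away(elements, still_conveys_meaning):
    """Try removing each element; leave it out if the chart still works."""
    kept = list(elements)
    for element in list(elements):
        trial = [e for e in kept if e != element]
        if still_conveys_meaning(frozenset(trial)):
            kept = trial  # the element was not needed, so it stays out
    return kept


def still_conveys_meaning(elements):
    """Stand-in for the designer's judgment (an assumption of this sketch).

    The chart needs its bars, a title, and some way to name the categories:
    either a legend or labels attached to the data points.
    """
    names_categories = "legend" in elements or "direct labels" in elements
    return "bars" in elements and "title" in elements and names_categories


chart = ["bars", "title", "legend", "direct labels",
         "gridlines", "border", "background image"]

print(chip_away(chart, still_conveys_meaning))
# ['bars', 'title', 'direct labels']
```

With this stand-in rule, the legend is dropped because the direct labels already name the categories, while the direct labels stay because nothing else would name them once the legend is gone.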
There are a lot of things that you can always take away to make your visualization less distracting and more oriented towards your goal. But to make the chipping away stage simpler, there are five main things to consider, according to Cole Nussbaumer Knaflic in her well-known book, Storytelling with Data:
De-emphasize the chart title so that it does not draw more attention than it deserves
Remove chart border and gridlines
Send the x- and y-axis lines and labels to the background (An extra tip from me: also consider taking them out completely)
Remove the variance in colors between the various data points
Label the data points directly
In the explanatory stage, approaching data visualization as a form of sculpting may remind us of how vital it is to keep chipping away the unnecessary parts to uncover what’s beneath. What you intend to convey is not perfectly visible until you shape it up.
Overall, approaching data visualization as a form of sculpting may remind us of the true purpose of the practice and crystallize the design in its best possible form.
Sign up for The KPI Institute’s Certified Data Visualization Professional course to learn the fundamentals of creating visual representations, the most effective layouts, channel selection, and reporting best practices.
Gone are the days when analyzing and visualizing data to get information was a job that was limited to the IT and business intelligence (BI) divisions. Gone also are the days when the sole possession of knowledge, skills, and tools for data processing was in the hands of the “data guy.”
Data is becoming more and more abundant and essential for various business operations. This makes centralizing data processing on one or two divisions an inevitable bottleneck. On the other hand, analytics and visualization tools are becoming easier to use, with more intuitive user-friendly interfaces that require less and less technical expertise.
What SSBI Is About
Self-service business intelligence (SSBI), also called self-service data exploration, has become an important approach for data-driven insights in business. It means enabling a wide range of employees who are not experienced with data to derive insights from relevant datasets and create exploratory visualizations that help them better understand the data and use it in reports. It’s also a part of what is called data democratization, if you’d like another fancy term on the plate.
It should, however, be distinguished from a second approach called dashboarding, which turns large amounts of data into finely curated reports on the most important KPIs within a well-developed narrative. Dashboarding should still be the responsibility of the experienced BI team. The SSBI approach aims to:
Avoid time delays in data-driven decision making among the low and mid-level teams that may happen due to the centralization of analytics responsibilities.
Minimize intuition-based decisions that can be made by low and mid-level teams on a daily basis due to lack of analytical capabilities.
Enhance internal communication within the teams by making data-driven insights and visualizations easier to generate, and therefore easier to integrate into reports more frequently.
Enhance external communication of the organization as the insights and visualizations can also be easily used in developing publications, like blog posts for example.
Google Sheets and Datawrapper
There are many visualization tools that can enable you to create an SSBI system for your organization. Some are technologically advanced, and each has its best uses and drawbacks.
Google Sheets and Datawrapper are two such tools. The advantages of using them are the following:
– Businesses with no capabilities of experienced teams or infrastructure can implement the system.
– Anyone can use it as it requires little to no technical expertise.
– Visualizations can be easily duplicated and edited, suiting fast-paced work routines.
– Visualizations can be easily formatted and laid out, leading to efficient reporting.
– Both interactive and static visualizations can be generated, suitable for embedding in various forms of reports, from web-based all the way to paper-based.
Using a self-service BI solution can help streamline operations and support critical decisions. It also encourages collaboration, simplifies daily business needs, and increases one’s competitive advantage. With the efficiency brought by SSBI, businesses can focus on what matters most to them.
We deal with data every day, especially at work. It can fuel our decisions and change the way we work. At the same time, if we’re surrounded by a huge amount of data, we may not find it easy to arrive at an optimal decision. This is where data visualization comes in.
Data visualization refers to the graphical representation of the data. It makes large amounts of information easier to understand and helps identify patterns and trends. People can easily comprehend information and make conclusions through data visualization.
“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space,” wrote American statistician Edward R. Tufte, author of the book “The Visual Display of Quantitative Information.”
Understanding how to approach data visualization allows people to equip themselves with the right tools, approach, and strategies as they gather data and present it visually. This is important to businesses that want to understand consumer behavior patterns or governments seeking data-backed insights on a crisis.
Data visualization may be considered a science because it is a process and represents data methodically and accurately. Data visualization begins with volumes of information, undergoes an intensive cleaning, classification, statistical and mathematical modeling, analysis, and design process, and ends with a visualization.
On the other hand, many argue that data visualization is a language because it uses diagrams to convey meaning. Data is encoded into symbology and semiology. The syntax and conventions of these diagrams are not inherent and must be learned.
Data visualization helps to communicate analytics results in pictures. In simple words, data visualization is the language of images. It is on par with the language of words, both written and spoken, and with the language of numbers and statistics.
Merging science and language
Science and language do not have to invalidate each other. Their elements can go hand in hand in data visualization.
In data visualization, the challenge is how to make more people take interest in scientifically processed data. Presenting appropriate and relevant information in an engaging format through design is what makes data visualization successful. Science processes and provides information based on certain objectives while design is a form of communication shaped by visual elements.
Combined, scientific data and design can generate meaning out of raw data. The end result of data visualization is almost always a story. In storytelling, the plot (design) won’t be able to progress without the characters (scientific data) and vice versa.
Ensuring that graphs and charts present meaningful results is important now more than ever. In MicroStrategy’s “2018 Global State of Enterprise Analytics,” 63% of data-driven organizations said that implementing analytics initiatives led to high efficiency and productivity while 57% said they became more effective in decision making.
With this, the challenge for organizations is to know how to structure, format, and present their graphical data in a way that allows them to make faster business decisions. Sign up for The KPI Institute’s Certified Data Visualization Professional course to learn the fundamentals of creating visual representations, the most effective layouts, channel selection, and reporting best practices.