Wednesday, April 17, 2024

On simplicity in data science communications

"Everything should be made as simple as possible, but no simpler" (paraphrase of A. Einstein)

I spent this past week at the 2024 INFORMS Analytics conference. One of the major themes across the speakers, the panels (on one of which I was a panelist), and the conversations in the halls was soft skills for data analysts and data scientists. Data scientists are subject to the same stereotype applied to all technical specialists: that they lack soft skills, and that this prevents our end stakeholders from understanding and taking advantage of the knowledge and capability we bring to our organizations. The most commonly offered solution is to ask the specialists to simplify the delivery. But, other than as an excuse for those with business backgrounds to beat up on the geeks, I don't know if this is the right solution. A better direction would be, as John-Eric Bonilla described it, for the data scientist to act as a translator: the person who takes the aggregated insight of the data and of the subject matter experts throughout the organization and translates it into the framework of the decision maker. This is a tall order, but it is the reason that Drew Conway, in his Data Science Venn Diagram, gives subject matter expertise equal weight with the math/stats and computer skills that get so much prominence in these discussions.

When I was deployed in Afghanistan, a brief that I prepared was being pushed up to the Commanding General, ISAF-Afghanistan. Members of the General's staff were present at the last brief, and their comment was that it was a good brief, but I needed to redo it in their format. They gave me a highly specified template. Now, I could be judgmental and say that it met no conceivable definition of "simple", but I won't, because I realized immediately that the template had two functions. First, it was there to shortstop any PowerPoint Rangers and save the General from Death by PowerPoint, because a commanding General in a combat zone is a busy person and does not have time for that. Second, the template presented the information inside the framework that particular General used to process information for the purpose of making a decision. And that specificity of presentation, so that the recipient can process the information using the framework they have as an expert in their area and make a decision, is the goal of technical communications, such as data science.

The common recommendation to data scientists is that we need to simplify our work for presentation to our decision maker audience. The reason we are given this message is that our decision maker audiences supposedly do not need or want our technical explanations and cannot understand complex topics. But this view of our decision maker stakeholders is demeaning. So far in my career, I have found the decision makers who motivate my work to be intelligent, subject experts in their own right, and fully capable of understanding detail and nuance. The key is not to remove subtlety and detail (the ability to handle them is the reason this person is in the position of decision maker), but to present the subtlety and detail that is important. Certainly, the tendency of technical experts to want to focus on the story of their own work does not help. The answer (IMHO) lies in the use of frameworks. Every specialty community that I know of has frameworks that are used to organize and communicate information. Examples are the range of SITREP formats used by specific emergency response and military communities, the 9-line medevac report, and the frameworks used in the medical community for reporting patient condition in specific circumstances. Individual leaders have also developed their own frameworks for making decisions, even if the framework is masked as intuition. In the ideal case, the record of good decisions that results is the reason they are in their position.

That makes the key to data communication understanding the decision making framework used by the experts in a given situation. In the case of the then Commanding General - ISAF, this framework was formalized by the General's staff, so that all briefs going to him were presented in that framework. The General could then fit all of the provided information into his internal decision making framework.

When this framework has not been formalized, the key is direct communication between the data analyst and the decision maker (or a surrogate). To understand how the decision maker thinks, the data scientist needs to communicate with the decision maker, or with someone who knows how the decision maker thinks (either intuitively, or because they are members of the same professional community, one that analyzes information in a standard way). This identifies both the type of information and the criteria that will be used to make the decisions. The data scientist's task then becomes either identifying the data needed to provide this information, or using the data that is available to come as close as possible to the information required. This unlocks the value of the data scientist without diminishing either the role or the capability of the decision maker.

There are a few ways for this to fail. The first is from the data scientist side. Many technical experts have no desire to learn the decision maker's process. This is often accompanied by the belief that the technical facts make the needed action self-evident. Then, from the other side, there are those who think that they give commands to people and the people doing the work should be able to get it done without the needed resources. Both fall under the heading of lack of communication between the analyst and the customer, which is widely regarded as the most common cause of data project failure. The role of the data analytics manager is to ensure that constant communication is maintained and to intervene if it is not. (There are managers who think their role is to be a broker. But this also breaks communication and does nothing to address the most common cause of failure in data analytics projects.)

Is being a translator easy? No. But what I have found on my projects is that the data scientist is often the first person who realizes who all of the people actually involved in an activity are, because the data scientist is tracing all of the data elements. So the data scientist needs to learn everyone's language to get a good picture of what is actually happening, and then communicate to the decision maker in a format the decision maker can understand. Yes, this is hard (80% of data projects fail, and while vendors use this to market products, those who investigate that number say it is mostly communications). But we are not the only people who have to take complex information and transmit it to a decision maker in a form they can understand and use to make decisions. The UX community does this too. And often they do it well. So can we.