Sunday, July 13, 2025

Adventures in core.logic: learning clojure and logic programming with help from Gen AI

This past month my project has been to learn logic programming and, as a vehicle to do this, to learn Clojure (again).  For those who are not computer scientists, logic programming is one of the four main programming paradigms:  procedural (what most people learn in an introductory programming class), object-oriented (what most computer science programs and professional programmers aim for; Java, C++, C#, and Ruby are all OO languages), functional programming (Lisp and its relatives), and logic programming.  The closest most people get to logic programming is SQL, which is declarative: you express the outcome you want, not the steps to get there.  The best-known logic programming language is Prolog.  A more recent expression of logic programming is miniKanren, a domain-specific language originally implemented in Scheme; there are implementations in other languages, whose quality seems to track how well those languages support functional programming.  This essay looks at (1) learning Clojure (a Lisp that runs on the Java Virtual Machine), (2) learning logic programming, (3) learning core.logic, the implementation of miniKanren in Clojure, and (4) using Generative AI to help with all of these.
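
To make the declarative point concrete, here is a minimal core.logic query (just a sketch to show the flavor; the values are invented). You state the constraints an answer must satisfy and the engine enumerates every value that fits, much as a SQL query describes the rows you want rather than a search procedure.

    ;; a minimal core.logic query: state constraints, get back every answer
    (require '[clojure.core.logic :as l])

    (l/run* [q]
      (l/membero q [1 2 3 4 5])   ; q must be a member of this list
      (l/conde                    ; and q must be 3 or 4
        [(l/== q 3)]
        [(l/== q 4)]))
    ;; => (3 4)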

This is my second exposure to Clojure, which is a Lisp (a functional programming language) that runs on the Java Virtual Machine. The big draw is that it provides a functional way of working while allowing the use of all Java libraries.  For a data scientist, the advantage of functional programming is that it is a much better style for data manipulation. For example, using R with the tidyverse is functional-style programming: you perform operations on data frames that return data frames, which lets you pipe together sequences of functions that conform to this pattern. (Pandas in Python is a flawed version of this, as not all Pandas functions follow this rule.)
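
The same pipeline style works on ordinary Clojure collections. A small sketch (the data here are invented): each step takes a collection and returns a collection, so the steps chain with the ->> threading macro, much like piping data frames through tidyverse verbs.

    (def measurements
      [{:site "A" :value 3} {:site "B" :value 8} {:site "A" :value 5}])

    (->> measurements
         (filter #(= "A" (:site %)))   ; keep the rows for site A
         (map :value)                  ; pull out the value column
         (reduce +))                   ; summarize
    ;; => 8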

My first run with Clojure was around 2014 (so says my GitHub timeline). At the time the Incanter project was trying to establish Clojure as a data analysis environment on the JVM, with the goal of being used in corporate IT departments that had standardized on the JVM (which puts obstacles in the way of using Python or R).  It was good enough that I wrote a model and associated analysis in Clojure for an attempted startup (a clean implementation not done at any of our home organizations). But the Incanter project stalled, and more recently a broader effort, Scicloj, to bring data analysis and scientific computing capabilities to Clojure shows promise.  One standard mantra I can confirm: Lisp claims that because it has very little syntax, it is easy to learn, and I would agree. After almost ten years, a short online course and a review of some books I had from a decade ago got me pretty much back up to speed.  When everything is a list, the question becomes what the form of that list is for the task, function, or library at hand, which is easier than any other language I work with, where I have to learn the philosophy of every package I use (or of the whole collection, in the case of the tidyverse in R).  In addition, the tooling was easier. Visual Studio Code has the Calva extension, which makes working with Clojure projects automatic (pretty much anything on the Java Virtual Machine needs an IDE to handle project setup, so a good IDE is essential).
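
A tiny illustration of what "the form of that list" means in practice (the function here is an invented example): a function definition and a local binding are both plain lists, only the expected shape differs, and a quoted form can be taken apart like any other data.

    (defn area [r] (* Math/PI r r))    ; a list of the shape (defn name [args] body)
    (let [r 2.0] (* Math/PI r r))      ; a list of the shape (let [bindings] body)

    ;; code is data: a quoted form is just a list you can inspect
    (first '(defn area [r] (* Math/PI r r)))
    ;; => defn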

For learning logic programming, I started with some Prolog materials, because that would let me focus on the logic and the thinking (Prolog is also fairly sparse in syntax).  I got Adventure in Prolog by Dennis Merritt and followed along, implementing the Nani adventure game as well as the genealogy exercise that is developed over the entire book.  But I was always going to move to miniKanren, because in any conceivable use I would be integrating logic programming into something else.

My first two attempts at moving from Prolog to a general-purpose programming language were with Julia and Clojure.  With Julia, there was Julog (an attempt to follow Prolog patterns in the Julia language). This seemed serviceable, although all I did was the adventure game. Then I looked at the Julia miniKanren projects.  All of them were the beginnings of an implementation, but none was complete enough to do anything.  (Scheme and miniKanren both have a reputation for being a budding language creator's first project because they are so simple to write, and then the creator's attention goes somewhere else.)  And even though I have used Julia in the past, I basically had to learn it over again because it changes with every version (I review books for computer publishers, so I get a chance to look at Julia every now and then, and it feels like I am starting over each time).

Clojure has the advantage that the main language is very stable (and since it is a Lisp it has the benefit of having seen the history of language design decisions, good and bad).  There is a fun graphic showing the history of changes to the Clojure source code that looks like layers accumulating, whereas the comparable graphics for other language projects look like landslides.  But the same cannot be said about core.logic.  When core.logic first came out it was unique in that it was an implementation of logic programming in a relatively mainstream computing environment (logic programming makes a lot more sense in a Lisp-style environment than in an Algol-style object-oriented/procedural one), so there are a lot of early tutorials. But around version 0.8.5 or so there was a major change in the organization of the core.logic library, and a sub-library, core.logic.pldb, was created to hold the non-logic things, which include facts and data.  This broke all of the tutorials, and as with faddish things, no one updated them, so the tutorials everyone points to are from 0.7.6 or so. As I repeated the Adventure in Prolog exercises, the getting-started introduction was easy, but I had to discover that there was a new way of doing things that involve actual data (as opposed to pure logic exercises), and I redid the Nani adventure and the bird expert system using the new core.logic and core.logic.pldb structure.
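
A sketch of what the split looks like in practice (the family facts are invented for illustration): relations and fact databases now come from clojure.core.logic.pldb, the logic operators stay in clojure.core.logic, and queries run against an explicit database value rather than globally asserted facts as in the older defrel/fact style.

    (require '[clojure.core.logic :as l]
             '[clojure.core.logic.pldb :as pldb])

    ;; declare a relation and load facts into a database value
    (pldb/db-rel parent p c)

    (def family
      (pldb/db
       [parent 'alice 'bob]
       [parent 'bob 'carol]))

    ;; query against that database: who is a grandchild of alice?
    (pldb/with-db family
      (l/run* [q]
        (l/fresh [x]
          (parent 'alice x)
          (parent x q))))
    ;; => (carol)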

The bird expert system exercise was particularly difficult. I had not actually done this set of exercises when I went through the Adventure in Prolog book (it does not start until about halfway through), so I tried to start from someone else's Prolog solution, and that completely failed.  I then used OpenAI's ChatGPT and Google Gemini to help me. Neither of them got it completely right, but they got me on the right track, and my solution does not look anything like the Prolog solution. The types of mistakes the Gen AI made were interesting.

Generative AI works by going through its training data (essentially the internet) and, using the tokens (roughly a word, sometimes part of a word, and sometimes a phrase) in the query, identifying other uses of that set of tokens and producing a probability distribution over options for the next token.  It then chooses the next token randomly according to those probabilities, appends it to the text so far, and repeats the process to get the token after that, and so on.  The randomness is what gives Gen AI its creativity rather than just being a search engine. But it also leads to mistakes, because the Gen AI does not actually understand any of its source texts, so it does not recognize the context of its sources or the fact that some sources may not go with others.
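
A toy illustration of that loop in Clojure (the tokens and probabilities are made up, and real models condition on the entire preceding text): pick the next token at random according to a probability table, and sampling again can give a different continuation.

    ;; hypothetical next-token distribution after "the cat sat on the ..."
    (def next-token-probs {"mat" 0.6 "sofa" 0.3 "keyboard" 0.1})

    (defn sample-token [probs]
      ;; weighted random choice: walk the table until the running total
      ;; of probabilities passes a random number in [0, 1)
      (let [r (rand)]
        (loop [[[tok p] & more] (seq probs) total 0.0]
          (if (or (nil? more) (< r (+ total p)))
            tok
            (recur more (+ total p))))))

    (repeatedly 3 #(sample-token next-token-probs))
    ;; => e.g. ("mat" "mat" "sofa") -- the randomness is the "creativity"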

This gets more problematic in a subject like core.logic, where the majority of the texts on the internet are out of date in a breaking way. Normally I say that Gen AI is particularly good at computing-related topics, but that is because of the vast quantity of material on the message boards that programmers and computing professionals frequent to ask questions and get them answered.  Clojure's core.logic is very different: there is not much material (Clojure is not one of the more common languages, and logic programming is a small niche within that), and there are at least three different eras that are not mutually compatible.  And since modern examples do not overwhelm historical ones in quantity, things get mixed together.

Now, how big of a problem is this?  In my experience using Generative AI to aid programming (again, I am a data scientist, so I am interested in data problems), Generative AI is good at providing program structure and style (which is very useful; (re-)learning APIs is time consuming), but it regularly gets the logic and the model wrong. As a scientist, logic and the model are things I am good at, so I don't mind examining code to correct them; what I wanted was help getting the thing into a running state!  This is why, despite Microsoft reporting 40% error rates in Copilot-generated code and OpenAI reporting a 70% failure rate on software engineering projects when using Gen AI, professional programmers still find Generative AI very useful.  It does get things like how to work with an API right, and it has pretty good programming style (with appropriate commenting!).  But the logic, which the Gen AI gets wrong, is something any competent programmer does not mind doing themselves.

The key to using Generative AI here is the same as elsewhere. It is good at style and structure, and not so good at facts and logic. But facts and logic are what subject matter experts are good at (and most subject matter experts are not so good at style and structure).  So a trained SME can play to Gen AI's strengths and cover for its weaknesses, but only if the human is paying attention.

Next step: repeating the Adventure in Prolog exercises, but using the Kanren library in Python.