-
@balacij this looks like a great analysis to me. The research questions sound good. My slides for my talk tomorrow tell a part of the story on where Drasil will shine.

There is a pattern in helpful computing tools - they capture information so that they can provide automation. Org mode captures rudimentary documentation information, like the section headings, and uses this to automatically create LaTeX and html documents with formatted section headings, along with a table of contents. Code generation in Maple works by capturing knowledge on symbolic manipulation through a computer algebra system. With this information, we can automate symbolic calculations and code generation.

Drasil can capture information about physics, about math, about computing, about documentation, etc. The information that we capture can be used to automatically generate software artifacts. The limiting factor is the knowledge capture. The nice thing about Drasil in Drasil is that we can extend Drasil as we understand different branches of knowledge better.
-
So I'm going to play devil's advocate here: I find that the "problem statement" is full of Drasil jargon and somehow describes everything in terms of low-level operational details. I'm not even sure I understand what the problem statement truly means! Rather than relying on Drasil terminology for communicating the ideas, I think some concrete example(s) of what the problem really is would help a lot. The purpose statement inherits the flaws of the problem statement.

I was really surprised by RQ1. It seemed to "come out of nowhere", i.e. to me the previous discussion wasn't at all leading to "Drasil in Drasil"! RQs 2-4 are nice 'speculative' questions, but a little too philosophical to make them good CS/SE research questions. It would be quite hard to know that they've been 'answered'.
-
@balacij Is that Problem/Purpose Statements and Question (PPS&Q)? As I am learning this writing technique, I understand that the problem is typically limited to 1-2 sentences.
-
I think that a detailed discussion of what 'Drasil in Drasil' entails, or even an abstract vision of it, would be worthwhile. I searched for 'Drasil in Drasil' in the search field and this discussion is one of the few pages that includes this phrase, even though I have heard it in discussions many times in the past year. This might be tacit knowledge at this point. I think it would benefit everyone if it was well documented. Thank you @balacij for creating this discussion and listing the "what does" question here.

The reason this came up is that I was looking through the Body.hs (and other example-specific) files and I couldn't help but wonder why we have the user do some of the things that they do. For example, having them use some of the combinators that we have when constructing sentences. On one hand I do understand why we do this for the sake of our recipes, capturing knowledge, and removing redundancy, but on the other hand, thinking about end-product usability, I can't help but notice that this increases the on-boarding time of new users. It is certainly a great process for Drasil-related research and for the sake of students working on Drasil, but if in the long term we want Drasil to be effortlessly used by practitioners, then Drasil will probably have to do some of this translation from basic text input, at least in some cases. This led me to wonder if this is one of the issues that 'Drasil in Drasil' is meant to tackle. Just a thought.
-
I've been trying out using a tablet to take notes. One of the nice things about it is that I can share my scribbles. I wrote it in "Dark Mode" with "Samsung Notes," so it looks a bit off in a standard PDF reader without also enabling dark/night mode. This is my current re-write as of right now; I will eventually re-write it here too: De-embedding Drasil (DraDrasilsil).pdf
-
When we write software, we're writing instructions that the computer interprets. Typically, we expect the computer to somehow give some sort of feedback while running and when it's done. We provide computers a means of giving us feedback by hooking them up to monitors, stereos, and other computers (external processing, networking with other computers, etc.). This is wonderful. With software, we can quickly offload monotonous tasks to machines. For example, we might have them process data for us, sanitizing it and noting irregularities. We might also have them hooked up to other machines and have them work together to, say, move packages in shipping warehouses to specific locations for trucks to deliver to their destinations. The machines just follow our instructions to a T. The instructions are defined through assembly languages. Of course, programming in nearly any large, raw assembly language is stressful and difficult because we have to keep track of what CPU architecture is being used, what quirks exist, what optimizations you can make, managing memory, etc., all while also worrying about why we chose to build the program in the first place (i.e., the requirements of the program).

Procedure Capture

To make software development easier (and less stressful), we built programming languages that sit atop the assembler languages. By doing this, we also gained information about what that assembler was previously doing as a general procedure. In other words, we gained the ability to similarly translate those higher-level languages into other assembler languages. We obtained procedural reuse and came one step closer to "primarily focusing on the real issue at hand" -- the requirements that the software must satisfy. For example, C is an abstraction over some assembly code (PDP 7?). But C is still too "close" to the machine. Manual memory management, garbage collection, and the like are all cruft we don't want to worry about. They should also be monotonous tasks because they're generally well-understood too (or we at least have general schemes we can follow).

Abstraction over Procedural Cruft

To remedy that situation, languages with automated garbage collection, memory management, etc. came around. For example, Java, Python, D, and Rust all handle different aspects of those same garbage collection and memory management issues. However, these languages remain procedure-oriented, abstracting over only the procedural cruft. These procedures are still the same steps and calculations that somehow compute solutions to the problems we were interested in, but with even less overhead and worry about maintaining the "machine." So, these other languages are just abstractions/"smarter" versions of the same procedures. Does it affect how we think about the formulation of solutions? In the way that we think about the procedural solution, yes. But in the context of the focal problem knowledge, not really. Similar to how we recognized the cruft of memory management and the like, can we make converting that problem knowledge into the solution easier too? Well, in a first attempt, procedures are shared through functions, libraries, frameworks, etc. However, these exhibit their own issues.
These issues are all symptoms of manually transcribing software as a "view" of our requirements/problem knowledge, and not defining what it means for it to be a "view" and then generating the solution. Succinctly, the issue is that we're still manually building the solution implementations of our well-understood problems!

Abstraction over Families of Problem Knowledge and Software Artifacts

In order to resolve these issues, we, again, need to look at the procedures as more "monotonous cruft" and see how we can abstract over them. Well, thankfully, we already have logical "models" (theories) that we think about when building the software. So, we look to capture that. When building software, we have some sort of document that developers and product owners share that stores the requirements of the software. Developers interpret these requirements and form related solutions. Unfortunately, these requirements are typically textual and are only verified by the eyes of the implementors and creators.
[Aside: I'm not entirely sure what software you would try to build without some sort of vague idea of the requirements, but in the event that it's possible, we will constrain the set of desired software to those that are well-understood.]

For a subset of well-understood [cite] knowledge, we have a means of auditing the software and the logic behind forming it. We should be able to reduce the whole process to just writing and auditing the logic now. Then, the "development" of the software would just be a particular view (e.g., generation up to design specifications and artifact configuration). Here again, we have a recurring idea of "going up an abstraction," but not quite in the same direction as earlier. Earlier, we were abstracting over procedural cruft. Here, we desire an abstraction over what was left over -- the solution as it pertains to a problem. Similar to removing the procedural cruft of a particular language, we have to remove the equivalent "procedural cruft" (the problem/solution) by the family of problems/solutions it pertains to. Previously, we were removing the cruft of one particular language. Here, we remove the cruft of one particular software family. Previously, you could only do that for one language at a time. Here, you can apply it to any language, but only for one kind of software. These abstractions are, in some sense, orthogonal. [Aside: it might have been better for me to talk about "machine" vs "business" logic to get the point across.]

Back to the focus: to make this monotonous (and hence automatable), we need to look at capturing the procedures one level "higher." Drasil is a prime example of this in practice. It converts problem knowledge into related solutions.

Drasil

Drasil is a software artifact generation suite. Drasil uses DSLs to encode domain-specific knowledge and to create opportunities for domain-specific interpretations (this is largely an abstraction over the business-oriented logic, allowing us to remove ourselves one level further from the "code"). Drasil is deeply embedded in Haskell, and requires developers (users of Drasil) to use Haskell to encode their ideas. Unfortunately, asking developers to use Haskell is a bit of an uphill fight (P1) for many reasons (tooling maturity, stability, complexity compared to languages used in businesses, etc.). So, how does Drasil really work? Drasil has 5 "major" components (at least 3 are directly mentioned in #2883, but I believe there are two more now) that make it special: chunk types, chunk transformers, chunk instances, compilers, and a runtime.
So, what do these 5 things look like in Drasil?

Examples

Chunk Types
Drasil/code/drasil-lang/lib/Language/Drasil/Chunk/DefinedQuantity.hs Lines 23 to 34 in 2178e68

Drasil/code/drasil-lang/lib/Language/Drasil/Chunk/Unital.hs Lines 25 to 60 in 2178e68
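To make the pattern concrete, here is a minimal, self-contained sketch -- not Drasil's actual code, and the class/field names are made up -- of the general shape the chunk types above take: a record that wraps a simpler chunk, plus typeclass instances that merely forward to the wrapped fields through lenses.

```haskell
{-# LANGUAGE TemplateHaskell #-}
-- Hypothetical sketch only; names do not correspond to real Drasil classes.
module ChunkSketch where

import Control.Lens (Lens', makeLenses, view)

type UID = String

class HasUIDSketch c where
  uidSketch :: Lens' c UID

-- A "concept"-like chunk: an identifier plus a human-readable term.
data ConceptSketch = ConceptSketch { _cid :: UID, _cterm :: String }
makeLenses ''ConceptSketch

instance HasUIDSketch ConceptSketch where
  uidSketch = cid

-- A "quantity"-like chunk layered on top: it adds a symbol and a unit, and
-- its instances are boilerplate that just forwards to the wrapped concept.
data QuantitySketch = QuantitySketch
  { _qconcept :: ConceptSketch
  , _qsymbol  :: String
  , _qunit    :: Maybe String
  }
makeLenses ''QuantitySketch

instance HasUIDSketch QuantitySketch where
  uidSketch = qconcept . cid

demo :: UID
demo = view uidSketch
  (QuantitySketch (ConceptSketch "len" "length of the rod") "l" (Just "m"))
```

Every extra layer repeats the same forwarding instances, which is the boilerplate being pointed at here.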
Expr: Drasil/code/drasil-lang/lib/Language/Drasil/Expr/Lang.hs Lines 93 to 144 in 2178e68

ModelExpr: Drasil/code/drasil-lang/lib/Language/Drasil/ModelExpr/Lang.hs Lines 86 to 152 in 2178e68

CodeExpr: Drasil/code/drasil-code-base/lib/Language/Drasil/Code/Expr.hs Lines 71 to 143 in 2178e68

Chunk Transformers

Sometimes we define them as free-floating "functions": [Aside: I think that "constructors" are also transformers. Really, almost all of our functions are transformers. The important thing is that they somehow involve chunks and don't do any sort of IO operations.]

Drasil/code/drasil-lang/lib/Language/Drasil/Chunk/Unital.hs Lines 64 to 79 in 2178e68

Anything from: https://github.com/JacquesCarette/Drasil/tree/master/code/drasil-printers/lib/Language/Drasil/Printing/Import

Drasil/code/drasil-printers/lib/Language/Drasil/Printing/Import/Expr.hs Lines 111 to 138 in 2178e68

There are also some that are defined through typeclasses; it might be best to gather them all into typeclasses so that we invert the current dependencies (I discussed this further in #2873 and #2896).

Drasil/code/drasil-theory/lib/Theory/Drasil/MultiDefn.hs Lines 75 to 78 in 2178e68

Drasil/code/drasil-theory/lib/Theory/Drasil/GenDefn.hs Lines 40 to 41 in 2178e68

etc. (unfortunately, there are very few typeclass examples that I can recall right now)

Chunk Instances

Chunks are instantiated using deeply embedded DSLs in Drasil:

Drasil/code/drasil-example/glassbr/lib/Drasil/GlassBR/Unitals.hs Lines 74 to 83 in 2178e68

Drasil/code/drasil-example/glassbr/lib/Drasil/GlassBR/IMods.hs Lines 41 to 53 in 2178e68

But then we need to manually gather all relevant chunks into a single database, for auditing and usage:

Drasil/code/drasil-example/glassbr/lib/Drasil/GlassBR/Body.hs Lines 129 to 151 in 2178e68

And then we also need to manually define a "system" that we're currently interested in:

Drasil/code/drasil-example/glassbr/lib/Drasil/GlassBR/Body.hs Lines 63 to 83 in 2178e68

Compilers

Note: the compiler does not take in external data. All "source code" inputs are contained within the Haskell binaries and source code.

Drasil/code/drasil-example/glassbr/app/Main.hs Lines 8 to 15 in 2178e68

I think of these as Drasil's "compilers." [Aside: note that a compiler is any coherent, meaningful collection of transformers connected together to form some sort of product from some source input.]

A runtime

Finally, the most important part: the part where we run Drasil. We run Drasil by compiling the Haskell source code and then running the produced executables. Every time we run those executables, we get the exact same outputs, ignoring things inputted to the executable (e.g., the "current" time, etc.). So, the compiled Haskell code is our runtime. [ASIDE: This is similar to this original topic #1 idea where I talked about partial evaluation, the chunk db, the executables, and whatnot.]

Observations

So, what can we observe from these specific cases?

Individually, from the examples

Chunk Types

We have a common record syntax with very similar boilerplate spread around (P2). The pattern of typeclasses mirroring the functionality of an internally held chunk is also widespread, but we have no way of automating it, or of designating it, without the excess Haskell cruft (P2.5). However, for other chunk types, such as Expr, ModelExpr, and CodeExpr, we have significant duplication among them (P3). We also have manually written "UIDs" for every chunk (P4), and we don't have any guarantee that they are unique (P5, but when #2873 is implemented, this issue becomes with respect to a particular ChunkDB).
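As a toy illustration of the duplication behind P3 (the constructors below are made up, not the real Expr/ModelExpr/CodeExpr definitions):

```haskell
-- Hypothetical toy languages; only the shape of the duplication matters.
module ExprDuplicationToy where

data ExprToy
  = LitE Double
  | SymE String
  | AddE ExprToy ExprToy

data ModelExprToy
  = LitM Double
  | SymM String
  | AddM ModelExprToy ModelExprToy
  | DerivM String ModelExprToy   -- a "model-only" construct

-- Mining the higher-level language from the lower-level one is pure
-- structural recursion -- more boilerplate that grows with every constructor.
toModelToy :: ExprToy -> ModelExprToy
toModelToy (LitE d)   = LitM d
toModelToy (SymE s)   = SymM s
toModelToy (AddE l r) = AddM (toModelToy l) (toModelToy r)
```

A third near-identical copy (for the code-oriented language) and another translation like toModelToy would complete the picture.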
Chunk reference maps and "referenced by" maps are also something that we should be able to automate from the structure of the types (P6). Interestingly, with both the types and the transformers, we're missing out on something important (and it's a bit more evident with Expr/ModelExpr/CodeExpr): theories (~ modules as records, typeclasses, etc. -- we will hopefully be talking more about this in the next in-person meeting).

Chunk Transformers

Our chunk transformers follow the same general scheme but are spread around everywhere. Analyzing what things we can do at any particular typed "hole" is difficult (P7) because it requires us to use our Haddock documentation and look for general functions. I would like us to be able to have options shown to us, similar to what typed-hole tooling provides elsewhere.

Chunk Instances

We build our chunk instances using DSLs and manually gather them for a ChunkDB. However, we've run into issues with conflicting UIDs. Additionally, when chunks are built and type-checked, we don't necessarily have a guarantee that they will actually work against the staged integrity checks we have (P8) until we actually run the whole program (which wastes our time).

Compilers

The "compiler" is really just a defined, meaningful gathering of transformers that can produce some sort of encoding of some set of artifacts that we can dump/print onto host computers. Right now, this isn't very clear in Drasil, and we aren't able to really analyze any compiler (though we really only have one at the moment) to show what knowledge (chunk types, etc.) it actually depends on (P9).

Runtime

[ASIDE: This is similar to my original post where I analyzed the Haskell runtime.] The runtime is a bit peculiar. It's what we use to look for information about Drasil's staged checks. But what's really going on during the runtime? All chunk instances, types, etc. are held in the executable. How can we get that information about stages and constraint checks earlier? Recall: partial evaluation. W.r.t. Drasil, all data is static since there is no interactivity, no input IO, no command-line arguments, etc. In other words, GHC should, in theory, be able to residualize the "compilers" (as shown above) down to some code that (a) prints feedback to the console, and (b) dumps software artifacts to the host machine if everything went well in the staged checks and compilation. Note that the feedback and the software artifact representations should be fully computed in the binaries already because Drasil has no external inputs. In other words, there is a sense of programming interactivity with the staged checks and the compiler that we are missing out on (P9), because it is completely deferred to the runtime of the Haskell binaries. How can we obtain that back-and-forth interactivity that we have with Haskell, with our Drasil programs?

Together

Looking at these together, what are we observing? We are observing the mechanical cruft Drasil has (as well as a lack of some analysis and other IDE-like tooling). [This sentence is meant to be an aggregation of the earlier "P" problems.] What are all of these a symptom of? These are all a symptom of the fact that Drasil is deeply embedded in Haskell. Is Haskell the right syntax we should be using to convey our ideas? Similar to what past and current researchers have done, can we abstract over it? Well, we can try to apply Haskell reflection, quotation, evaluation, and other (meta)programming concepts here, but we'll still be encoding something in Haskell which we won't really have much of an internal understanding of.
How can we obtain that? That would be a step towards the Drasil in Drasil long-term goal as well.

Up to this point, many languages and toolkits before Drasil focused on abstracting over monotonous tasks. The work involved capturing what they deemed the "important" bits and showing how you can mechanically obtain the other things. They captured the key ideas from a more general syntax and made a domain-specific one from it with their key ideas. Drasil is really the same idea (to me), whereby we have software artifacts, a flow that's generally well-understood, and a bunch of cruft around it that we don't want to deal with, so we have Drasil to help us improve that workflow of developing the software artifacts, and also to let us improve the quality of the software artifacts while we do it. Those 5 major ideas form a powerful ideology about software construction, capturing every conceivable facet of the design, development, contribution (i.e., requirements), and maintenance. [I wrote a lot about this in chapter 2 of my Master's thesis, so I will spare you the details here.]

Problem Statement

Drasil abstracts over families of software by their related problems, allowing users to describe software problems and generate solutions to them. However, as Drasil is constructed, there is growing cruft in the implementation, data is internal, encoded knowledge is hard to analyze, and the Haskell syntax conflicts with what we want from Drasil (e.g., theories and theory combinators, better analysis tooling, interactivity with staging, and the like). Can we create a better syntax, specialized tools, and an interpreter to relieve us of these issues?

Potential Research Work/Questions
-
About the start of the long comment: This is a long description of the large distance between what we actually want computers to do for us (the tasks, described in the words humans use) and the low-level instructions that machines actually follow.
In some sense, it is straightforward to abstract over assembly but, as you say, it gets harder and harder to abstract over higher level languages. This is where we need to be guided by the reason why we're abstracting, i.e. the eventual goal. That's where the top-down picture comes in. We want to look at the actual tasks we want the computer to perform; and we want to look at the highest-level description of those tasks, i.e. the words we use between humans to describe those tasks.

We need to add an additional idea (which is implicit here and in Drasil): that the collection of tasks that we wish to perform, in general, contains a huge amount of similarities. There might be tens of thousands of uses for computers, but these uses contain a lot of repetition of various pieces (call them features, components, whatever). From Draco onwards, we know that "domain knowledge" is a key idea behind this. It sits somewhere close to the highest level of abstraction. But that kind of knowledge is too raw; it captures only some aspects of things, generally missing the "know-how" (also called "procedural knowledge" by philosophers and in cognitive science, but that's just very confusing for us, as we've overloaded 'procedural' a bit too much).

Drasil then starts from the idea that a lot of what is traditionally written by hand by humans could in fact be the output of some other program. We still have some non-trivial gaps.
One of the things that is sub-optimal in the 5 major components description of Drasil is that it misses one level, and it's the one that sits underneath the chunks: the basic kinds of data that are actually gathered together into chunks. That is really where the analysis needs to start. I do agree with the view that a very large part of our code base is 'transformers'. It is still useful to classify them using
Chunk instances then occur because our 'chunks' are like types (for information) and then the instances are actual instances (i.e. like terms of a type). This is good. But we do need to revisit our construction methods to make this nicer.

The notion of 'run time' is actually quite a bit more complicated. There is the run-time of Drasil as a generator, as well as the run-time of the generated code (and the compile-time of the generated code in some cases too). Finding good words to describe that is hard. It's worth reading up on partial evaluation (especially the 'cogen' idea that comes out of the Futamura projections) to get a better handle on that.

At the end, there are some excellent questions, like "What does Drasil really do?". A really solid description of that is still missing. A proper answer is big and complicated, and needs to encompass all of Drasil.

Re: "Is Haskell the right language for Drasil?" To me the answer is a resounding 'yes' as well as a resounding 'no'! "Huh?" you might say. This is because, to me, Drasil is the name for a collection of languages and language processors. The first thing we need to have in hand is a proper name for all of the languages, and all of the processors. Then, for every single one, we can ask the question of what language it should be written in (i.e. what meta-language we should use for each). Because Drasil is a piece of software that performs a task that we want performed, if we go back to the start of this (now long) comment, we see how Drasil itself can be seen as fitting in that gap, and how it could be the target of abstraction.

About the very last comment: "can we abstract over what that manually created generator does [...]". My answer: once we thoroughly understand what the domain of it is, what its input, output and process are, yes we can. We are several steps away from that thorough understanding. We have lots of simpler pieces that we still don't thoroughly understand.

All in all: great stuff.
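For reference, a small type-level sketch of the partial-evaluation / Futamura-projection idea mentioned above (the names are illustrative, not from any particular library):

```haskell
module FutamuraSketch where

-- A program that takes a "static" input s and a "dynamic" input d.
type Prog s d a = s -> d -> a

-- A partial evaluator ("mix"): given a program and its static input, produce
-- a residual program needing only the dynamic input. (A real mix would also
-- aggressively specialize; only the type is shown here.)
mix :: Prog s d a -> s -> (d -> a)
mix prog s = prog s

-- An interpreter: a program whose static input is source code and whose
-- dynamic input is that source program's own input.
type Interp src input output = Prog src input output

-- First Futamura projection: specializing an interpreter to one source
-- program yields a "compiled" version of that program. The second projection
-- (mix applied to mix and an interpreter) yields a compiler; the third yields
-- a compiler generator ("cogen").
compileVia :: Interp src input output -> src -> (input -> output)
compileVia = mix
```

One suggestive reading for Drasil: the generator plus its entirely static chunk data should residualize to a program that only prints feedback and dumps artifacts, which is what the earlier "runtime" observations gesture at.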
-
Thank you @JacquesCarette 😄 I will respond shortly after I write this comment — I just need to get this thought out of my head and onto something before I forget. On my drive home, I realized it might be good to write out what I think the work might look like in stages:
other tasks might include:
Aside from steps (1) and (2), the order of the rest of the two lists doesn't matter much, I think. Note: one super critical thing that I haven't really spoken about (or fully understood well enough), which would greatly affect the above steps, is @JacquesCarette's question about "what sits beneath chunks." I think these approximate steps are in tune with what you're mentioning, too, @JacquesCarette?
-
I have a bit of a mind dump regarding the past discussion about encoding transformers as typeclasses; I think there's a bit more nuance to it than I previously understood. My previous understanding was that, since we mine "higher-level" encodings from the "lower-level" ones, it should be a property of the higher-level ones that the lower-level ones can be recreated from them. I think that's a fair assessment, but I also claimed that we could capture all transformers using typeclasses. For example, I thought we could use (approximately):
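A rough, hypothetical sketch of the kind of typeclass in question (the class names and toy types are illustrative only, not the real Drasil classes):

```haskell
-- Hypothetical sketch: capture a transformer as a typeclass, so that any
-- encoding that can be re-expressed in a "higher-level" language advertises
-- that fact through an instance.
module TransformerSketch where

data LoExpr = LoLit Double | LoAdd LoExpr LoExpr
data HiExpr = HiLit Double | HiAdd HiExpr HiExpr | HiDeriv String HiExpr

-- The transformer-as-typeclass: "c can be expressed as a HiExpr".
class ExpressHi c where
  expressHi :: c -> HiExpr

instance ExpressHi LoExpr where
  expressHi (LoLit d)   = HiLit d
  expressHi (LoAdd l r) = HiAdd (expressHi l) (expressHi r)

-- The "property" direction -- recreating the lower-level encoding -- is only
-- partial, since the higher-level language has constructs with no
-- lower-level counterpart.
recreateLo :: HiExpr -> Maybe LoExpr
recreateLo (HiLit d)     = Just (LoLit d)
recreateLo (HiAdd l r)   = LoAdd <$> recreateLo l <*> recreateLo r
recreateLo (HiDeriv _ _) = Nothing
```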
Some transformers might be "properties" of the encodings, such as an encoding being expressible in a higher-level one. So, we have two transformers we want to encode: (i, a property) recreating the lower-level encoding from the higher-level one, and (ii) mining the higher-level encoding from the lower-level one.

Ok, now, am I thinking too much about this (note that I thought about this a while ago, but only typed it up now)? Is this just an oddity of Haskell? Is this a conflict between Haskell syntax and our desired syntax for building Drasil? Should all "transformers" have "configuration" knowledge? In other words, should properties be captured in this same style as general transformers (maybe properties need to be captured differently)?

All this being said, I think it might be better to look at this through the lens of theories, theory extensions, and theory morphisms (amongst other kinds of theory-related concepts) and re-evaluate.

I am going to start driving to McMaster now. See y'all soon!
-
De-embedding Drasil's Implementation
Towards "Drasil in Drasil"
Problem Statement
With each case study, we have a general understanding of what occurs during its runtime (and what we will ever do in it): create a "ChunkDB" -> register chunks in the ChunkDB -> repeatedly generate things using pre-written "generation directives" with a basic set of "input choices." The runtime-registered chunks rely on type information described in Haskell, but these types (and the information we have about the types) are not available at Drasil's runtime -- they are hidden from the runtime, exposed only to the Haskell compiler. Re-creation of the type information at runtime requires delving into complex reflection (specifically relying on GHC now, if I understand correctly).
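As a rough sketch of that flow (every type and function name below is hypothetical, chosen only to make the described shape explicit):

```haskell
-- Hypothetical sketch of the per-case-study flow: build a ChunkDB, register
-- chunks, then repeatedly generate artifacts from pre-written generation
-- directives and a basic set of input choices.
type UID = String
data Chunk = Chunk { chunkUID :: UID, chunkBody :: String }

newtype ChunkDB = ChunkDB [(UID, Chunk)]

emptyDB :: ChunkDB
emptyDB = ChunkDB []

register :: Chunk -> ChunkDB -> ChunkDB
register c (ChunkDB cs) = ChunkDB ((chunkUID c, c) : cs)

data Directive = GenSRS | GenCode
newtype Choices = Choices { targetLang :: String }

generate :: ChunkDB -> Choices -> Directive -> String
generate (ChunkDB cs) _  GenSRS  = "SRS over " ++ show (length cs) ++ " chunks"
generate (ChunkDB cs) ch GenCode =
  "code (" ++ targetLang ch ++ ") over " ++ show (length cs) ++ " chunks"

main :: IO ()
main = do
  let db = register (Chunk "pendulum.length" "length of the rod") emptyDB
  putStrLn (generate db (Choices "Python") GenSRS)
  putStrLn (generate db (Choices "Python") GenCode)
```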
As we are compiling the current Drasil source code with deeply embedded instances of data, and then running each compiled case study, we know that each execution of the compiled binary will give us the same generated results with each run. Realistically, since all data we input is known at compile-time, GHC should be able to discard nearly the entire "runtime" by evaluating the whole program during compilation (other than the final IO-performing action that either errors out or dumps the final Doc values onto the working directory; i.e., the data it dumps should be fully evaluated and in the compiled binary itself). Hence, compiling Drasil and its input information is peculiar. It appears that Drasil wants to become an interpreter for its input information rather than something that is compiled alongside its input information.

Purpose Statement
The purpose of this research is to understand what we are building in the Haskell implementation and what is needed of a host language through (i) building a Drasil language and interpreter, (ii) re-writing the knowledge contained in Drasil's Haskell implementation in the new language (ensuring that the same artifacts can also be generated), and (iii) describing Drasil's syntax and interpreter in the new language such that we can generate the interpreter. In order for (iii) to be feasible, we will first need to (iv) bootstrap a Drasil language interpreter in another language.
Research Questions