Why do programming languages usually not implement number types with units?

elcaro · Sep 23, 2021

In fact, the only programming language I know of that implements something like units is the programming language Frink.

In other programming languages, the implementation is left to the programmer; for example, in C++ this can be implemented as a class.

jedishrfu · Sep 23, 2021

Yes that’s true. We often use variable naming conventions to convey the units used and then apply a math conversion to get other units of measure.

In a recent project, we were constantly converting English measure input to metric for modeling and then back to English measure for display.

There are some domain specific languages that do this in their class design so that you can add apples and oranges to get fruit.

z.meters = x.feet + y.yards
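As a rough illustration of that one-liner, here is a minimal Python sketch. It is not the Groovy DSL itself; the `Length` class, its constructor names, and the choice to store meters internally are all assumptions made for this example.

```python
from dataclasses import dataclass

# Conversion factors to meters (both are exact by international definition).
METERS_PER_FOOT = 0.3048
METERS_PER_YARD = 0.9144


@dataclass(frozen=True)
class Length:
    """Hypothetical length type; the value is always stored in meters."""
    meters: float

    @classmethod
    def from_feet(cls, feet: float) -> "Length":
        return cls(feet * METERS_PER_FOOT)

    @classmethod
    def from_yards(cls, yards: float) -> "Length":
        return cls(yards * METERS_PER_YARD)

    @property
    def feet(self) -> float:
        return self.meters / METERS_PER_FOOT

    def __add__(self, other: "Length") -> "Length":
        # Only Length + Length is defined; a bare number is rejected.
        if not isinstance(other, Length):
            return NotImplemented
        return Length(self.meters + other.meters)


x = Length.from_feet(3.0)
y = Length.from_yards(2.0)
z = x + y
print(z.meters, z.feet)
```

Because every value is normalized to meters on construction, the addition itself never needs to know which unit the inputs arrived in.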

In one Groovy domain-specific language example, I saw that pharmacists could use the language to do complicated drug formulations, mixing and matching English measure with metric measure.

http://docs.groovy-lang.org/docs/latest/html/documentation/core-domain-specific-languages.html

pbuk · Sep 23, 2021

I disagree. When implementing physical models I have found that it is very important to use consistent units throughout the implementation.

pbuk · Sep 23, 2021

jedishrfu said:

In one Groovy domain-specific language example, I saw that pharmacists could use the language to do complicated drug formulations, mixing and matching English measure with metric measure.

This would be a good application for mixed units.

elcaro · Sep 23, 2021

pbuk said:

I disagree. When implementing physical models I have found that it is very important to use consistent units throughout the implementation.

What do you disagree with? Your reaction in fact supports the position mentioned in the OP, that one would need some (standard or user defined) implementation of units for specific programming cases, like physical models.

Frabjous · Sep 23, 2021

It’s computationally expensive. For example, if you are doing time stepping why do you want to carry that information through every step?

pbuk · Sep 23, 2021

elcaro said:

What do you disagree with?

I disagree that multiple units should be dealt with in the implementation of a physical model. An application that is capable of converting between different units is a completely different thing.

elcaro · Sep 23, 2021

caz said:

It’s computationally expensive. For example, if you are doing time stepping why do you want to carry that information?

Agree. But the normal number types (without unit information) can be used for such parts of the program. The use case for number types with unit information is for interaction with the user (input/output) and simple calculations. So you could have a program that internally uses meters for length units, while the user interface uses feet or miles.

pbuk · Sep 23, 2021

elcaro said:

Agree. But the normal number types (without unit information) can be used for such parts of the program. The use case for number types with unit information is for interaction with the user (input/output) and simple calculations. So you could have a program that internally uses meters for length units, while the user interface uses feet or miles.

Precisely. So we deal with the unit conversions in the presentation layer, not the implementation layer (and certainly not embedded in the language).
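A small Python sketch of that split, under assumed names: `parse_feet` and `format_feet` stand in for the presentation layer, and `fall_time` is a made-up model function that only ever sees SI values as plain floats.

```python
METERS_PER_FOOT = 0.3048  # exact by definition


def parse_feet(text: str) -> float:
    """Presentation layer (input): user-supplied feet -> internal meters."""
    return float(text) * METERS_PER_FOOT


def fall_time(height_m: float, g: float = 9.80665) -> float:
    """Model layer: plain floats, SI only. Time to fall height_m from rest,
    ignoring drag; g defaults to standard gravity in m/s^2."""
    return (2.0 * height_m / g) ** 0.5


def format_feet(length_m: float) -> str:
    """Presentation layer (output): internal meters -> feet for display."""
    return f"{length_m / METERS_PER_FOOT:.1f} ft"


height_m = parse_feet("100")
t = fall_time(height_m)
print(f"A drop of {format_feet(height_m)} takes about {t:.2f} s")
```

The model function carries no unit information at all, so nothing extra is computed inside a time-stepping loop; the cost is that nothing stops a caller from passing feet to `fall_time` by mistake.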

Frabjous · Sep 23, 2021

elcaro said:

Agree. But the normal number types (without unit information) can be used for such parts of the program. The use case for number types with unit information is for interaction with the user (input/output) and simple calculations. So you could have a program that internally uses meters for length units, while the user interface uses feet or miles.

You can then run into issues with significant digits and cutoffs. It’s better to write non-dimensional code where the user only needs to enter self-consistent data (which could be generated by a front end).

anorlunda · Sep 23, 2021

I think it could make a fun student project to implement classes for numbers with units.

The classes could compute the units of a result or do other dimensional analysis.

The classes could catch errors in units (such as adding quantities with different units.)

The classes might perform automatic unit conversion, or forbid unit conversion thus forcing consistent units.

A second version of the classes could be swapped in at runtime after debugging, dropping the computationally expensive units and leaving just numbers. Or the classes with units might be invoked only during source code editing, but not at runtime.

In other words, make it a feature of the editor, not the language.

I confess, the thought piqued my interest.
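A bare-bones Python sketch of what such a class might look like. The `Quantity` name and the three-entry exponent tuple (length, mass, time) are assumptions for illustration; a real version would cover all SI base dimensions and unit scaling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical number tagged with exponents of (length, mass, time)."""
    value: float
    dims: tuple

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Addition is only meaningful for identical dimensions.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dimensions {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


distance = Quantity(100.0, (1, 0, 0))  # 100 m
duration = Quantity(20.0, (0, 0, 1))   # 20 s
speed = distance / duration            # dimensions become (1, 0, -1)
print(speed)

try:
    distance + duration
except TypeError as err:
    print("rejected:", err)
```

This version checks at runtime on every operation, which is exactly the overhead discussed earlier in the thread; it is meant as a debugging aid rather than something to leave in an inner loop.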

Jarvis323 · Sep 23, 2021

caz said:

It’s computationally expensive. For example, if you are doing time stepping why do you want to carry that information through every step?

Can be done at compile time.

Vanadium 50 · Sep 23, 2021

Is 2 degrees + 359 degrees 361 degrees? 1 degree? Something else?

Is 1 farad + 1 farad 2 farads? Or half a farad? Or something else?

Is red + green + blue black, like pigment or white, like light?

I think it's entirely reasonable to make these decisions when building classes and other data structures. I think trying to build this into intrinsic types will create more problems than it solves.
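The farad question can be made concrete with a short Python sketch (function names are just for this example): the same two values combine differently depending on the circuit, so no single built-in "+" for farads could be right in both cases.

```python
def parallel(c1: float, c2: float) -> float:
    """Capacitances in parallel add directly."""
    return c1 + c2


def series(c1: float, c2: float) -> float:
    """Capacitances in series combine as the reciprocal of summed reciprocals."""
    return 1.0 / (1.0 / c1 + 1.0 / c2)


print(parallel(1.0, 1.0))  # 2.0 (farads)
print(series(1.0, 1.0))    # 0.5 (farads)
```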

PeterDonis · Sep 23, 2021

elcaro said:

Summary:: For implementing, let's say, physical models it would be handy to not only know the numeric value of a variable, but also its units, and to have all kinds of conversions (Fahrenheit to Celsius, meters to feet, etc.) and addition/subtraction rules (can only add or subtract values with the same units).

This is a classic case of something that should be implemented as a library, not intrinsic to a programming language. Why? Because different applications will use very different unit systems. Libraries can easily address those different needs. Making units intrinsic to a language would make that harder, not easier.

jedishrfu · Sep 23, 2021

Using libraries is the most common way. However, a case could be made for subtyping floats and int as units of measure with the compiler validating and autoboxing as needed.

So many times in scientific applications maintained over the long term, the units of measure for data get lost along with any documentation that explained them thoroughly. Programmers are left to wing it through testing and careful code review to ensure that they haven’t affected the data values with their changes.

I’ve seen a few cases where conversions were applied, then unapplied, then reapplied as the data wended its way through the code. In one case, the conversion from English units to meters was applied twice, giving some really confusing results. In another case, degrees were confused with radians, which produced Spirograph-like display output.
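Both kinds of bug are easy to reproduce with plain floats. The following Python sketch uses made-up values; the point is only that neither mistake raises any error.

```python
import math

METERS_PER_FOOT = 0.3048

# Bug 1: a conversion applied twice.
length_ft = 10.0
once = length_ft * METERS_PER_FOOT   # correct: feet -> meters
twice = once * METERS_PER_FOOT       # wrong: "converts" a value already in meters

# Bug 2: degrees passed where radians are expected.
angle_deg = 90.0
wrong = math.sin(angle_deg)                # interprets 90 as radians
right = math.sin(math.radians(angle_deg))  # converts to radians first

print(once, twice)
print(wrong, right)
```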

Naming conventions for variables can sometimes help, although @Mark44 can tell of the fiasco of Simonyi variable naming notation at his company that drove people mad.

FactChecker · Sep 23, 2021

The programming language can use consistent standard units internally and convert to other units only when it interfaces with the user or other programs. The Mathcad software by Mathsoft does that. It is very useful and can assist in dimensional analysis in its operations. I thought that was where software was headed decades ago.
Then along came Ada. Instead of being helpful, it just refused to compile and taunted you cruelly at every step. We called it "strict typing hell". The result was that unchecked conversion was used far too often just to get something to run. But it often did stop you from adding inches to feet.

Mark44 · Sep 23, 2021

pbuk said:

So we deal with the unit conversions in the presentation layer, not the implementation layer (and certainly not embedded in the language).

Strongly agree.

jedishrfu said:

However, a case could be made for subtyping floats and int as units of measure

Floating point numbers and integral numbers are already subtyped, but on the basis of ranges of values rather than units. For example, in C and C++, there are float, double, and long double for floating point types; there are short, int, long, and long long for integral types, with signed and unsigned versions of each.

jedishrfu said:

Naming conventions for variables can sometimes help, although @Mark44 can tell of the fiasco of Simonyi variable naming notation at his company that drove people mad.

AKA Hungarian notation (Charles Simonyi is Hungarian). This was more related to the type of the variable, using prepended "warts," and not so much on the units involved in numeric types. For example, Windows code has a lot of variables with names like pszNameStr, where the psz prefix means "pointer to a zero-terminated string."

Baluncore · Sep 24, 2021

I have written an overload for BASIC functions that tracks the full dimensional analysis through the computations. It correctly converts to and identifies all SI units and recognises named dimensions.

Such a dimensional analysis only needs to be done once at compile time. I do that by overloading before the run to check my program, or to identify problems.

Once you understand what you are doing, you do not need to use it very often as you perform the dimensional analysis in your subconscious while coding.

Rive · Sep 24, 2021

elcaro said:

Why do programming languages usually not implement number types with units?

Most programming languages are supposed to be kind of universal, and very few of them are so specialized that such extravagant functions would be addressed.

I would expect this kind or level of specialization to be a possible part of specialized 'languages' dedicated to dealing with math or physics problems.

Vanadium 50 · Sep 24, 2021

Mark44 said:

For example, in C and C++, there are float, double, and long double for floating point types; there are short, int, long, and long long for integral types, with signed and unsigned versions of each.

Yes, but the exact implementation is fairly loose. I believe the rule is "short can be no longer than long". (Slightly modified in C++11)

I was going to bring up Ada. Given its DoD roots, having to define a range of validity when declaring a variable makes sense. In the words of Bob Newhart: "Men, I have just been notified that we will be surfacing in just a moment and you'll be happy to know that you will be gazing on the familiar skyline of New York City... or possibly Buenos Aires."

I still think the proper place for this is in the classes, not the primitives. I also think doing this half- er...baked is not helpful. The Geant toolkit uses mm and MeV internally, and defines conversion constants so you can write energy=1.0*GeV and length=2.0*cm. This looks good, but length=2.0*GeV is perfectly acceptable (and is 2 meters). I would argue that should throw an error. It's not what the programmer intended.
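The arithmetic behind that "2 meters" can be reproduced in a few lines of Python. This is a sketch of the convention as described above (mm and MeV both equal to 1 internally), not Geant's actual header; the constant names here are illustrative.

```python
# Internal units, per the description above: millimeter and MeV are both 1.0.
mm = 1.0
cm = 10.0 * mm
m = 1000.0 * mm
MeV = 1.0
GeV = 1000.0 * MeV

energy = 1.0 * GeV   # stored as 1000.0
length = 2.0 * cm    # stored as 20.0

# The constants are bare floats, so nothing objects to this:
oops = 2.0 * GeV     # intended as a length, stored as 2000.0
print(oops / m)      # reads back as 2.0 when interpreted in meters
```

Since every constant is just a scale factor, the dimension is lost the moment the multiplication happens, which is why the mistake goes unnoticed.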

(BTW, the Ada litany of errors? I would say "good". If the programmer's intent is ambiguous or inconsistent, the compiler should throw an error)

elcaro · Sep 24, 2021

Vanadium 50 said:

Is 2 degrees + 359 degrees 361 degrees? 1 degree? Something else?

Is 1 farad + 1 farad 2 farads? Or half a farad? Or something else?

Is red + green + blue black, like pigment or white, like light?

I think it's entirely reasonable to make these decisions when building classes and other data structures. I think trying to build this into intrinsic types will create more problems than it solves.

This is application defined. Different applications would need units differently.
For example, when adding degrees, sometimes you want the normal sum and other times the sum modulo 360 degrees.
Colours add differently for paint than for light, so you would need different addition rules, while still being able to convert a light colour to a paint colour.
Etc.
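For the degrees case, the two rules could be sketched in Python like this (the function names are made up for the example):

```python
def add_plain(a_deg: float, b_deg: float) -> float:
    """Accumulated rotation: keeps full turns."""
    return a_deg + b_deg


def add_wrapped(a_deg: float, b_deg: float) -> float:
    """Heading or direction: result is reduced to the range [0, 360)."""
    return (a_deg + b_deg) % 360.0


print(add_plain(2.0, 359.0))    # 361.0
print(add_wrapped(2.0, 359.0))  # 1.0
```

Which one is correct depends on whether the application tracks total rotation or just a direction, so the choice has to live in application code.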

FactChecker · Sep 24, 2021

Rive said:

Most programming languages are supposed to be kind of universal, and very few of them are so specialized that such extravagant functions would be addressed.

I think it is more accurate to say that most general-purpose languages are supposed to be universal. There are a great many special-purpose languages that make their targeted applications much easier to deal with. IMO, there are many more such special-purpose languages in use than there are commonly used general-purpose languages.

jbergman · Sep 25, 2021

elcaro said:

Summary:: For implementing, let's say, physical models it would be handy to not only know the numeric value of a variable, but also its units, and to have all kinds of conversions (Fahrenheit to Celsius, meters to feet, etc.) and addition/subtraction rules (can only add or subtract values with the same units).

In fact, the only programming language I know of that implements something like units is the programming language Frink.

In other programming languages, the implementation is left to the programmer; for example, in C++ this can be implemented as a class.

elcaro, you are correct that this feature should be added to more languages and is incredibly important for verification of correctness. Frankly, many of the replies in this thread are shocking and wrong. There are famous examples of systems failing because of the use of incompatible units.

F# also supports this feature very elegantly with Units of Measure.

The example at the above link illustrates how elegant this feature is.

Unit of Measure Example:

// Mass, grams.
[<Measure>] type g
// Mass, kilograms.
[<Measure>] type kg
// Weight, pounds.
[<Measure>] type lb

// Distance, meters.
[<Measure>] type m
// Distance, cm
[<Measure>] type cm

// Distance, inches.
[<Measure>] type inch
// Distance, feet
[<Measure>] type ft

// Time, seconds.
[<Measure>] type s

// Force, Newtons.
[<Measure>] type N = kg m / s^2

// Pressure, bar.
[<Measure>] type bar
// Pressure, Pascals
[<Measure>] type Pa = N / m^2

// Volume, milliliters.
[<Measure>] type ml
// Volume, liters.
[<Measure>] type L

// Define conversion constants.
let gramsPerKilogram : float<g kg^-1> = 1000.0<g/kg>
let cmPerMeter : float<cm/m> = 100.0<cm/m>
let cmPerInch : float<cm/inch> = 2.54<cm/inch>

let mlPerCubicCentimeter : float<ml/cm^3> = 1.0<ml/cm^3>
let mlPerLiter : float<ml/L> = 1000.0<ml/L>

// Define conversion functions.
let convertGramsToKilograms (x : float<g>) = x / gramsPerKilogram
let convertCentimetersToInches (x : float<cm>) = x / cmPerInch

[<Measure>] type degC // temperature, Celsius/Centigrade
[<Measure>] type degF // temperature, Fahrenheit

let convertCtoF ( temp : float<degC> ) = 9.0<degF> / 5.0<degC> * temp + 32.0<degF>
let convertFtoC ( temp: float<degF> ) = 5.0<degC> / 9.0<degF> * ( temp - 32.0<degF>)

// Define conversion functions from dimensionless floating point values.
let degreesFahrenheit temp = temp * 1.0<degF>
let degreesCelsius temp = temp * 1.0<degC>

printfn "Enter a temperature in degrees Fahrenheit."
let input = System.Console.ReadLine()
let parsedOk, floatValue = System.Double.TryParse(input)
if parsedOk
   then
      printfn "That temperature in Celsius is %8.2f degrees C." ((convertFtoC (degreesFahrenheit floatValue))/(1.0<degC>))
   else
      printfn "Error parsing input."

The reason why this is not implemented is that technically it is very difficult to do.

FactChecker · Sep 25, 2021

There are two levels of implementing units to think about in a computer language:
1) Simple conversion of units to standard units can be done in the external interfaces. That allows the programming language to be free of worrying about units.
2) Dimensional analysis, as the Mathcad software by Mathsoft does (and apparently F#, which @jbergman mentioned), is much more ambitious. It forbids calculations that do not have the right dimensions. You cannot add x miles to y miles/sec. It requires the programming language, or at least the compiler, to have knowledge of the units.

PeterDonis · Sep 25, 2021

jbergman said:

There are famous examples of systems failing because of the use of incompatible units.

This fact does not necessarily imply that the best way to fix the problem is to build units directly into programming languages, instead of using libraries to handle units and unit conversions. The fact that not everyone agrees with your opinion on this does not mean everyone else's posts are "shocking and wrong".

jbergman · Sep 25, 2021

PeterDonis said:

This fact does not necessarily imply that the best way to fix the problem is to build units directly into programming languages, instead of using libraries to handle units and unit conversions. The fact that not everyone agrees with your opinion on this does not mean everyone else's posts are "shocking and wrong".

Type safety is always preferable for critical systems. A programmer can make a human error in a conversion library. With units of measure that code won't even compile.

Now in some cases there might be tradeoffs that force one to abandon such an approach, i.e., critical performance constraints but as a general rule one should prefer type safety.

jbergman · Sep 25, 2021

I should mention that Haskell also supports this feature with the units package, although I am less familiar with it.

https://hackage.haskell.org/package/units

PeterDonis · Sep 25, 2021

jbergman said:

Type safety is always preferable for critical systems.

Perhaps. But the vast majority of systems are not critical systems.

jbergman said:

as a general rule one should prefer type safety.

I don't think this "general rule" is by any means universally accepted.

Vanadium 50 · Sep 25, 2021

PeterDonis said:

This fact does not necessarily imply that the best way to fix the problem is to build units directly into programming languages

I agree with this.

First, it's not entirely clear what is being proposed. If it is that "length in meters" is an internal type and "length in centimeters" is a different internal type such that they cannot be added without explicit conversion, that means that the only units that can ever be used are the ones built-in to the language.

Now, if one says, "no, this can be extended in the language to go beyond these intrinsic types", well, we're there now. I can do this in C++ today, where length_in_meters is an instance of the length class, which has two members: the value, and the units. (And if you like, length and area are derived classes from a base class)

FactChecker · Sep 26, 2021

PeterDonis said:

This fact does not necessarily imply that the best way to fix the problem is to build units directly into programming languages, instead of using libraries to handle units and unit conversions. The fact that not everyone agrees with your opinion on this does not mean everyone else's posts are "shocking and wrong".

I think that you are underestimating the benefit of a compiler that can detect a mismatch of units and dimensions in an equation. I have seen the benefit when I used Mathcad and it sounds like F# has the same capability although I am not familiar with F#. IMO, that aspect has not been fully appreciated in many of the posts.

PeterDonis · Sep 26, 2021

FactChecker said:

I think that you are underestimating the benefit of a compiler that can detect a mismatch of units and dimensions in an equation.

Compilers can do that with appropriate class definitions as well as with built-in language features.

Baluncore · Sep 26, 2021

FactChecker said:

I think that you are underestimating the benefit of a compiler that can detect a mismatch of units and dimensions in an equation.

I once thought that dimension checking would be valuable, so as a challenge I wrote the code to do it really well. It was great fun to write, and quite educational.
But now I find that after all, I never really needed it.

One of the challenges was using signed integers for dimensions, then deciding how to handle the square root of kg/m³. I found a very simple solution which was fun to implement.
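The post does not say what that solution was. One possible approach, shown here purely as a guess in Python, is to store the exponents as rational numbers so that halving an odd exponent stays exact; the `sqrt_dims` helper and the (length, mass, time) ordering are assumptions for this sketch.

```python
from fractions import Fraction


def sqrt_dims(dims):
    """Dimensions of a square root: halve every exponent exactly."""
    return tuple(Fraction(d) / 2 for d in dims)


density = (-3, 1, 0)        # (length, mass, time) exponents for kg/m^3
root = sqrt_dims(density)   # exponents -3/2, 1/2, 0
print(root)

# Squaring the root (doubling the exponents) recovers the original dimensions.
print(tuple(2 * d for d in root) == density)
```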

All computations should be in SI units. Then all unit conversions are to or from SI. If I get an equation wrong, then the numbers will not pass the testing, the same thing happens if I get a conversion factor wrong, or I get my loops inside out.

The units you use will be decided by your data source and destination. How can those imports be forced to use the same convention as your compiler? All you can do is verify that the input data and results fall in reasonable ranges.

Dimensional analysis should be done before you write any code. It is hard enough getting to know and trust a good compiler without burdening it with doing your due diligence for you. There will still be many coding mistakes for you to make, things that could never be detected by dimensional analysis.

The only time dimensional verification or tracking might be useful would be in a once-off numerical calculator like Mathcad. Then you could enter the dimensional units as well as the numbers, and so check the dimensions of your equations before writing your code to be compiled for speed, free of all run-time dimensional analysis.

Jarvis323 · Sep 26, 2021

Vanadium 50 said:

I agree with this.

First, it's not entirely clear what is being proposed. If it is that "length in meters" is an internal type and "length in centimeters" is a different internal type such that they cannot be added without explicit conversion, that means that the only units that can ever be used are the ones built-in to the language.

Now, if one says, "no, this can be extended in the language to go beyond these intrinsic types", well, we're there now. I can do this in C++ today, where length_in_meters is an instance of the length class, which has two members: the value, and the units. (And if you like, length and area are derived classes from a base class)

There is actually a Boost library to do this. Maybe it will even one day be part of the standard library. To do it all at compile time and efficiently, it uses template metaprogramming, of course. The downside is that template metaprogramming is enormously complex.

The Boost.Units library is a C++ implementation of dimensional analysis in a general and extensible manner, treating it as a generic compile-time metaprogramming problem. With appropriate compiler optimization, no runtime execution cost is introduced, facilitating the use of this library to provide dimension checking in performance-critical code. Support for units and quantities (defined as a unit and associated value) for arbitrary unit system models and arbitrary value types is provided, as is a fine-grained general facility for unit conversions. Complete SI and CGS unit systems are provided, along with systems for angles measured in degrees, radians, gradians, and revolutions and systems for temperatures measured in Kelvin, degrees Celsius and degrees Fahrenheit. The library architecture has been designed with flexibility and extensibility in mind; demonstrations of the ease of adding new units and unit conversions are provided in the examples.

In order to enable complex compile-time dimensional analysis calculations with no runtime overhead, Boost.Units relies heavily on the Boost Metaprogramming Library (MPL) and on template metaprogramming techniques, and is, as a consequence, fairly demanding of compiler compliance to ISO standards.

https://www.boost.org/doc/libs/1_74_0/doc/html/boost_units.html

It might be that if you incorporated it into the language itself, then you could achieve a cleaner and easier-to-use design. Probably much, much less than 1% of scientists are experts in C++, probably about 1% of C++ programmers are experts in template metaprogramming, and even fewer are experts in the Boost Metaprogramming Library.

Where I imagine it could shine would be in a high-level domain-specific language for scientific programming. Technically, incorporating units makes the language more expressive. The compiler will know more about your intent, and can thus do more in terms of optimization as well as generate more helpful warnings and error messages. Of course, you have to make such a compiler.

You could use Boost.Units as a backend for the implementation of a simpler language.

elcaro · Sep 26, 2021

jbergman said:

Type safety is always preferable for critical systems. A programmer can make a human error in a conversion library. With units of measure that code won't even compile.

Now in some cases there might be tradeoffs that force one to abandon such an approach, i.e., critical performance constraints but as a general rule one should prefer type safety.

Perhaps these cases could be handled by performance optimization, which can be done automatically as a last step without losing the constraints on units enforced by the compiler.

elcaro · Sep 26, 2021

Vanadium 50 said:

I agree with this.

First, it's not entirely clear what is being proposed. If it is that "length in meters" is an internal type and "length in centimeters" is a different internal type such that they cannot be added without explicit conversion, that means that the only units that can ever be used are the ones built-in to the language.

Now, if one says, "no, this can be extended in the language to go beyond these intrinsic types", well, we're there now. I can do this in C++ today, where length_in_meters is an instance of the length class, which has two members: the value, and the units. (And if you like, length and area are derived classes from a base class)

Both meters and centimeters are lengths, and both are of a similar type which you can add; you just need a conversion, which can be done automatically. Adding lengths and, let's say, time would however raise a compilation error. Multiplying or dividing lengths and time would be OK though, creating a new type.
