In standard Haskell, we can’t use the same field name for different records in the same module, which is quite annoying. However, the DuplicateRecordFields extension is a handy quick fix most of the time.

In this tutorial we take a detailed look at:

  1. What we mean by using “the same field name for different records” and why the compiler rejects such code.
  2. How to get that code to compile with the DuplicateRecordFields extension.
  3. How to import and use field names in other modules (avoiding overloading).
  4. How to use overloaded field names in a module.

The Duplicate Record Field Problem

Quite often we define record data types that share the same field name:

-- MyModule.hs file
module MyModule where
  
data Person = MkPerson { name :: String, age :: Int }
data Animal = MkAnimal { name :: String }
module
Haskell keyword that starts a module declaration; the rest of the file belongs to that module. The module name should match its file name, and an export list lets us hide implementation details, much like the public/private keywords do for Java classes. ➡ Official tutorial
data Person = MkPerson { name :: String, age :: Int }
New custom data type definition. The Person on the left is the type name, and the MkPerson on the right is the (value) constructor that builds actual values of that type. Type and constructor are often given the same name to avoid having to come up with different names. In OO languages like Java this would be a class, in TypeScript an object type definition, and in Rust/C++ a struct. We then use the constructor to create a real value (called an instance or object in OO languages) of that type. ➡ in-depth explanation
The similarity, especially to structs, is easier to see with a different formatting:
-- type/"class" definition
data Person = MkPerson { -- type and constructor
  name :: String,        -- field 'name' of type string
  age :: Int             -- field 'age' of type int
}

-- value/"instance"
myPerson = MkPerson { name = "John", age = 21}
::
Interpret as “has type”.

If we try to compile that module, we get an error indicating a name conflict:

error: 
  Multiple declarations of ‘name’
  Declared at: /..MyModule.hs:3:24 
               /..MyModule.hs:4:24

The problem is that for every record field the Haskell compiler generates a “field selector function”. Most object-oriented programmers would call it an “accessor function” or “getter” for an attribute, as in languages like Java or Python:

..
data Animal = MkAnimal { name :: String }
..
pet = MkAnimal { name = "Nemo" } -- name used as constructor argument
..
petName = name pet               -- generated name getter

Here the generated getter function has the following type:

name :: Animal -> String

Give it an Animal and it will extract the corresponding field/attribute value.
See: Field selector functions
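
Because the generated getter is an ordinary function, it can be passed to other functions like any other value. The following is only a minimal sketch, assuming a module that defines the Animal record alone; the names pets and petNames are made up for illustration:

```haskell
-- A single record type, so 'name' is unambiguous here.
data Animal = MkAnimal { name :: String }

-- Hypothetical sample data.
pets :: [Animal]
pets = [MkAnimal { name = "Nemo" }, MkAnimal { name = "Dory" }]

-- The generated getter 'name :: Animal -> String' is passed to 'map'.
petNames :: [String]
petNames = map name pets

main :: IO ()
main = print petNames  -- prints ["Nemo","Dory"]
```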

If we use the same field name in different records within the same module, Haskell generates two top-level functions with the same name:

-- MyModule.hs file
module MyModule where
  
data Person = MkPerson { name :: String, age :: Int }
data Animal = MkAnimal { name :: String }

-- generated
name :: Person -> String
..
name :: Animal -> String
..

And that is forbidden in Haskell (as in most other programming languages).


A common but annoying workaround is to add the type name to the fields to make them unique:

-- MyModule.hs file
module MyModule where
  
data Person = MkPerson { personName :: String, personAge :: Int }
data Animal = MkAnimal { animalName :: String }
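
With prefixed fields, both getters can be used side by side without any language extension. A small sketch of how that might look; the function describe and the sample values are hypothetical:

```haskell
data Person = MkPerson { personName :: String, personAge :: Int }
data Animal = MkAnimal { animalName :: String }

-- Both getters have distinct names, so there is no conflict.
describe :: Person -> Animal -> String
describe p a =
  personName p ++ " (" ++ show (personAge p) ++ ") owns " ++ animalName a

main :: IO ()
main = putStrLn (describe (MkPerson { personName = "John", personAge = 21 })
                          (MkAnimal { animalName = "Nemo" }))
-- prints: John (21) owns Nemo
```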

However, there are more convenient workarounds like the following.

Compile Type Definitions With DuplicateRecordFields

We can simply enable the DuplicateRecordFields Haskell language extension:

-- MyModule.hs file
{-# LANGUAGE DuplicateRecordFields #-}  -- <---- only change

module MyModule where
  
data Person = MkPerson { name :: String, age :: Int }
data Animal = MkAnimal { name :: String }

-- no further use of problematic 'name' getter in this module
{-# LANGUAGE .. #-}
Syntax (called a “pragma” or “compiler directive”) at the very top of our file that tells the Haskell compiler to enable language extensions for this file/module only. ➡ Other pragma examples
data Person = MkPerson { name :: String, age :: Int }
Custom data type definition. Person is the type, and MkPerson is the (value) constructor to build values of that type. Type and constructor names are frequently the same to avoid having to invent new names. ➡ in-depth explanation
::
Read it as “has type”.

If we only define our data types but don’t use the problematic name function within the same module, this module compiles and we are done. Because name can now belong to either the Person or the Animal type, it’s “overloaded”.
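
Building values still works in such a module, because the constructor tells the compiler which record the field belongs to. A minimal sketch, assuming GHC with the extension enabled; the values john and nemo and the deriving Show clauses are added only for illustration:

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data Person = MkPerson { name :: String, age :: Int } deriving Show
data Animal = MkAnimal { name :: String } deriving Show

-- 'name' is used only as a constructor argument, never as a getter.
john :: Person
john = MkPerson { name = "John", age = 21 }

nemo :: Animal
nemo = MkAnimal { name = "Nemo" }

main :: IO ()
main = do
  print john
  print nemo
```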

Use Problematic Fields In Other Modules

Another module can import and use our data type definitions with the overloaded field name if it imports only one of the name getters:

-- OtherModule.hs file
module OtherModule where
  
import MyModule (Person(MkPerson, name, age)) -- constructor and fields; no 'Animal' type import

myPerson :: Person
myPerson = MkPerson {name = "Doe", age = 21}

-- ✅ 'name' in function with unspecified argument 'p'.
-- No error, because 'name' is unambiguous in this module.
myFunction p = name p 

Then within this other module, the function name isn’t overloaded, so there’s no problem.
See: Haskell import Wiki

If this other module needs both name getters (for Person and Animal), then we import one or both types qualified:

-- OtherModule.hs file
module OtherModule where
  
import MyModule (Animal(MkAnimal, name)) -- no 'Person' type import
import qualified MyModule as M (Person(MkPerson, name, age)) -- only 'Person' type import

-- Fields are qualified too: unqualified 'name' is the 'Animal' field here.
myPerson :: M.Person
myPerson = M.MkPerson { M.name = "Doe", M.age = 21 }

-- ✅ 'name' in function with unspecified argument 'p'.
-- No error, because 'name' with qualifier 'M.' clearly belongs to 
-- the 'Person' import.
myFunction p = M.name p 

Then name isn’t overloaded, because here the unqualified name works on Animal and M.name works on Person. If we don’t use qualified imports but still need both name getters, we again have an overloaded field and have to use explicit type annotations as described in the following section.

Use An Overloaded Field In The Same Module

When a field like name is overloaded in a module, we have to add type annotations in the right places to make it clear to the compiler which name (the one for Person or the one for Animal) we mean. Here are examples of what works and what doesn’t:

-- MyModule.hs file
{-# LANGUAGE DuplicateRecordFields #-}

module MyModule where
  
data Person = MkPerson { name :: String, age :: Int }
data Animal = MkAnimal { name :: String }

-- Usage of overloaded 'name':

-- ✅ 'name' as constructor argument. 
-- 'name' type operating on 'Person' type is obvious from 'MkPerson'.
myPerson :: Person
myPerson = MkPerson { name = "John", age = 18 }


-- ❌ 'name' as setter. 
-- Quirk: Error even though 'myPerson' type is obvious from above.
otherPerson = myPerson { name = "Wick" }
-- error: 
--  • Record update is ambiguous, and requires a type signature  
--  • In the expression: myPerson {name = "Wick"}

-- ✅ 'name' as setter with explicit input record type.
otherPerson' = (myPerson :: Person) { name = "Wick" }

-- ✅ 'name' as setter with explicit result record type.
-- The result (otherPerson'') is declared to be a 'Person', so 'name'
-- must be operating on a 'Person' type, not 'Animal'.
otherPerson'' :: Person
otherPerson'' = myPerson { name = "Wick" }


-- ❌ 'name' in function with unspecified argument 'p'.
-- 'myFunction' could be operating on a 'Person' or 'Animal', so type unclear.
myFunction p = name p 
-- error: 
--  • Ambiguous occurrence ‘name’  
--  • It could refer to 
--     either the field ‘name’, 
--        defined at MyModule.hs:6:26 
--      or the field 'name', 
--        defined at MyModule.hs:5:26


-- ❌ 'name' in function with explicit type 'Person' for 'p'.
-- Quirk: Compiler doesn't "infer the type of the argument 
-- to determine the datatype". 
-- So, from knowing that 'p' has type 'Person' the compiler 
-- isn't able to conclude that 'name' must be operating on 'Person' type.
myFunction' :: Person -> String
myFunction' p = name p 
-- error: 
--  • Ambiguous occurrence ‘name’ ...

-- Same here. Making 'p' type even more explicit doesn't help.
myFunction'' p = name (p :: Person)
-- error: 
--  • Ambiguous occurrence ‘name’ ...


-- ✅ 'name' in function with explicit type for 'name'.
otherFunction p = (name :: Person -> String) p

-- ✅ 'name' in function with indirect "complete type" for 'name'.
-- The complete type 'Person -> String' of 'name' can be concluded directly
-- from the argument type of the helper function 'id' here.
otherFunction' p = (id name) p
  where id :: (Person -> String) -> (Person -> String)
        id f = f
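
Another way to read an overloaded field, not covered in the examples above, is to pattern match on the constructor, which tells the compiler which record is meant, just as in record construction. This is only a sketch under that assumption; the helper names nameOfPerson and nameOfAnimal are hypothetical:

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data Person = MkPerson { name :: String, age :: Int }
data Animal = MkAnimal { name :: String }

-- The constructor in the pattern selects the record,
-- so the field 'name' is unambiguous here.
nameOfPerson :: Person -> String
nameOfPerson (MkPerson { name = n }) = n

nameOfAnimal :: Animal -> String
nameOfAnimal (MkAnimal { name = n }) = n

main :: IO ()
main = do
  putStrLn (nameOfPerson (MkPerson { name = "John", age = 18 }))  -- prints John
  putStrLn (nameOfAnimal (MkAnimal { name = "Nemo" }))            -- prints Nemo
```

These helpers are essentially hand-written versions of the prefixed getters from the first workaround, but the record definitions themselves keep the short field name.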

So on one hand, using DuplicateRecordFields helps to compile our type definitions, which should be enough most of the time. On the other hand, using an overloaded getter like name still needs an explicit, lengthy type annotation like (name :: Person -> String). Fortunately, this last annoyance can be overcome by using a lens library like Optics.
See: DuplicateRecordFields extension documentation
See: Optics lenses package


Malte Neuss

Java Software Engineer by day, Haskell enthusiast by night.