Documentation ¶
Overview ¶
Package gedcom provides functions to parse and produce GEDCOM files.
GEDCOM (Genealogical Data Communication) is a standard format used for exchanging genealogical data between software applications. This package includes functionality for both parsing existing GEDCOM files and generating new ones.
Decoding GEDCOM Files ¶
The package provides a streaming Decoder for reading GEDCOM files. Use NewDecoder to create a decoder that reads from an io.Reader:
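// Open the file and let the decoder read from it directly,
// rather than loading the whole file into memory first.
f, err := os.Open("family.ged")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

d := gedcom.NewDecoder(f)
g, err := d.Decode()
if err != nil {
	log.Fatal(err)
}

// Print the first recorded name of each individual.
for _, ind := range g.Individual {
	if len(ind.Name) > 0 {
		fmt.Println(ind.Name[0].Name)
	}
}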
The decoder is streaming and can handle large files without loading the entire contents into memory.
Encoding GEDCOM Files ¶
The package also provides an Encoder for generating GEDCOM files. Use NewEncoder to create an encoder that writes to an io.Writer:
g := &gedcom.Gedcom{
	Header: &gedcom.Header{
		SourceSystem: gedcom.SystemRecord{
			Xref:        "MyApp",
			ProductName: "My Application",
		},
		CharacterSet: "UTF-8",
	},
	Individual: []*gedcom.IndividualRecord{
		{
			Xref: "I1",
			Name: []*gedcom.NameRecord{
				{Name: "John /Doe/"},
			},
			Sex: "M",
		},
	},
	Trailer: &gedcom.Trailer{},
}

f, err := os.Create("output.ged")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

enc := gedcom.NewEncoder(f)
if err := enc.Encode(g); err != nil {
	log.Fatal(err)
}
Data Model ¶
The Gedcom struct is the top-level container returned by the decoder and accepted by the encoder. It contains slices of records for individuals, families, sources, and other GEDCOM record types.
IndividualRecord represents a person and contains their names, sex, life events (birth, death, etc.), family links, and citations.
FamilyRecord represents a family unit and links to husband, wife, and children as IndividualRecord pointers.
EventRecord is a flexible type used for both events (birth, death, marriage) and attributes (occupation, residence). The Tag field indicates the event type.
SourceRecord and CitationRecord handle source citations for genealogical claims.
Name Parsing ¶
The SplitPersonalName helper function parses GEDCOM-formatted names:
parsed := gedcom.SplitPersonalName("John \"Jack\" /Smith/ Jr.")
// parsed.Given = "John"
// parsed.Nickname = "Jack"
// parsed.Surname = "Smith"
// parsed.Suffix = "Jr."
User-Defined Tags ¶
GEDCOM allows custom tags prefixed with an underscore. These are captured in UserDefinedTag slices on most record types, preserving vendor-specific extensions.
Specification Coverage ¶
This package implements approximately 80% of the GEDCOM 5.5 specification, which is sufficient for parsing about 99% of real-world GEDCOM files. It has not been extensively tested with non-ASCII character sets.
Index ¶
- type AddressDetail
- type AddressRecord
- type AssociationRecord
- type ChangeRecord
- type CitationRecord
- type DataRecord
- type Decoder
- type Encoder
- type EventRecord
- type FamilyLinkRecord
- type FamilyRecord
- type FileRecord
- type Gedcom
- type Header
- type IndividualRecord
- type Line
- type MediaRecord
- type NameRecord
- type NoteRecord
- type ParsedName
- type PlaceRecord
- type RepositoryRecord
- type ScanErr
- type Scanner
- type SourceCallNumberRecord
- type SourceDataRecord
- type SourceEventRecord
- type SourceRecord
- type SourceRepositoryRecord
- type SubmissionRecord
- type SubmitterRecord
- type SystemRecord
- type Trailer
- type UserDefinedTag
- type UserReferenceRecord
- type VariantNameRecord
- type VariantPlaceNameRecord
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AddressDetail ¶ added in v0.1.0
type AddressRecord ¶
type AddressRecord struct {
Address []*AddressDetail
Phone []string
Email []string // 5.5.1
Fax []string // 5.5.1
WWW []string // 5.5.1
}
See https://www.tamurajones.net/GEDCOMADDR.xhtml for a very informative analysis of the ADDR structure.
type AssociationRecord ¶ added in v0.0.4
type AssociationRecord struct {
Xref string // Cross-reference to the associated individual
Relation string // Relationship type (e.g., "godparent", "witness")
Citation []*CitationRecord // Source citations
Note []*NoteRecord // Notes about this association
}
AssociationRecord links an individual to another person with a defined relationship, such as godparent, witness, or friend.
type ChangeRecord ¶ added in v0.0.4
type ChangeRecord struct {
Date string // Date of the change in GEDCOM date format
Time string // Time of the change
Note []*NoteRecord // Notes about the change
}
ChangeRecord indicates when a record was last modified.
type CitationRecord ¶
type CitationRecord struct {
Source *SourceRecord // The source being cited
Page string // Page number or location within the source
Data DataRecord // Data extracted from the source
Quay string // Quality assessment (0-3, with 3 being direct evidence)
Media []*MediaRecord // Media objects (e.g., photo of the source page)
Note []*NoteRecord // Notes about this citation
UserDefined []UserDefinedTag // User-defined tags
}
CitationRecord represents a citation to a source for a specific claim. It links to a SourceRecord and provides details about where in the source the information was found.
type DataRecord ¶
type DataRecord struct {
Date string // Date the data was recorded
Text []string // Verbatim text from the source
UserDefined []UserDefinedTag // User-defined tags
}
DataRecord contains data extracted from a source citation.
type Decoder ¶
type Decoder struct {
// contains filtered or unexported fields
}
A Decoder reads and decodes GEDCOM objects from an input stream.
func NewDecoder ¶
NewDecoder returns a new decoder that reads from r.
func (*Decoder) Decode ¶
Decode reads GEDCOM-encoded data from its input and parses it into a Gedcom structure.
func (*Decoder) LogUnhandledTags ¶ added in v0.2.0
LogUnhandledTags configures the decoder to log any unrecognized GEDCOM tags to the provided writer. This is useful for debugging GEDCOM files that contain non-standard or vendor-specific tags.
type Encoder ¶ added in v0.2.13
type Encoder struct {
// contains filtered or unexported fields
}
Encoder writes GEDCOM-encoded data to an output stream. Use NewEncoder to create an Encoder and Encoder.Encode to write a Gedcom structure.
The encoder handles GEDCOM line length limits automatically, using CONT (continuation) and CONC (concatenation) tags to split long text.
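The following is a minimal sketch of how CONT and CONC splitting can work. It is not the Encoder's actual implementation: the splitValue function and its maxValue parameter (a limit on the value part of a line, counted in runes) are hypothetical names chosen for illustration, and a real encoder may choose split points differently, for example to avoid breaking next to a space.

```go
package main

import (
	"fmt"
	"strings"
)

// splitValue breaks text into GEDCOM lines. The first line carries tag at the
// given level. Each embedded newline starts a CONT line and each chunk beyond
// maxValue runes starts a CONC line, both at level+1.
// maxValue is a hypothetical limit used only for this sketch.
func splitValue(level int, tag, text string, maxValue int) []string {
	if maxValue < 1 {
		maxValue = 1 // guard against a limit that would never consume input
	}
	var out []string
	curTag, curLevel := tag, level
	for _, segment := range strings.Split(text, "\n") {
		runes := []rune(segment)
		// Emit at least one line per segment, even when the segment is empty.
		for first := true; first || len(runes) > 0; first = false {
			n := len(runes)
			if n > maxValue {
				n = maxValue
			}
			line := fmt.Sprintf("%d %s", curLevel, curTag)
			if n > 0 {
				line += " " + string(runes[:n])
			}
			out = append(out, line)
			runes = runes[n:]
			// Any further chunk of this segment is a concatenation.
			curTag, curLevel = "CONC", level+1
		}
		// The next segment follows a newline, so it is a continuation.
		curTag = "CONT"
	}
	return out
}

func main() {
	for _, l := range splitValue(1, "NOTE", "abcdefgh\nij", 5) {
		fmt.Println(l)
	}
	// Prints:
	// 1 NOTE abcde
	// 2 CONC fgh
	// 2 CONT ij
}
```

Splitting on runes rather than bytes keeps multi-byte characters intact, which matters for files declared as UTF-8.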
func NewEncoder ¶ added in v0.2.13
NewEncoder returns a new encoder that writes to w.
type EventRecord ¶
type EventRecord struct {
Tag string // Event type tag (e.g., "BIRT", "DEAT", "MARR")
Value string // Event value, often "Y" to indicate event occurred
Type string // Detailed event type for generic EVEN tags
Date string // Date in GEDCOM date format
Place PlaceRecord // Location where the event occurred
Address AddressRecord // Address associated with the event
Age string // Age of the individual at the time of the event
ResponsibleAgency string // Agency responsible for the record
ReligiousAffiliation string // Religious affiliation associated with event
Cause string // Cause (e.g., cause of death)
RestrictionNotice string // Privacy restriction (GEDCOM 5.5.1)
ChildInFamily *FamilyRecord // Link to parent family for birth events
AdoptedByParent string // For adoption: "HUSB", "WIFE", or "BOTH"
Citation []*CitationRecord // Source citations for this event
Media []*MediaRecord // Media objects (e.g., photos, certificates)
Note []*NoteRecord // Notes about this event
UserDefined []UserDefinedTag // User-defined tags
}
EventRecord represents an event or attribute in a person's or family's life. Common event tags include BIRT (birth), DEAT (death), MARR (marriage), BURI (burial), CHR (christening), and DIV (divorce). Common attribute tags include OCCU (occupation), RESI (residence), EDUC (education), and RELI (religion).
type FamilyLinkRecord ¶
type FamilyLinkRecord struct {
Family *FamilyRecord // Pointer to the linked family
Type string // Relationship type (e.g., "birth", "adopted", "foster")
Note []*NoteRecord // Notes about this family link
}
FamilyLinkRecord links an individual to a family. It is used both for linking children to their parents (via [IndividualRecord.Parents]) and for linking spouses to their families (via [IndividualRecord.Family]).
type FamilyRecord ¶
type FamilyRecord struct {
Xref string // Unique cross-reference identifier for this family (e.g., "@F1@")
Husband *IndividualRecord // Pointer to the husband/father individual
Wife *IndividualRecord // Pointer to the wife/mother individual
Child []*IndividualRecord // Pointers to child individuals
Event []*EventRecord // Family events (marriage, divorce, census, etc.)
NumberOfChildren string // Total number of children, may differ from len(Child)
UserReference []*UserReferenceRecord // User-provided reference numbers
AutomatedRecordId string // Unique record ID assigned by the source system
Change ChangeRecord // Record of when this record was last modified
Note []*NoteRecord // Notes attached to this family
Citation []*CitationRecord // Source citations for this family
Media []*MediaRecord // Media objects (photos, documents) for this family
UserDefined []UserDefinedTag // User-defined tags (prefixed with underscore)
}
FamilyRecord represents a family unit in GEDCOM. It links individuals together as husband, wife, and children, and contains family events such as marriage, divorce, and census records.
type FileRecord ¶ added in v0.0.4
type FileRecord struct {
Name string // File path or URL
Format string // File format (e.g., "jpeg", "gif", "pdf")
FormatType string // Media type (e.g., "photo", "document")
Title string // Title or caption for this file
UserDefined []UserDefinedTag // User-defined tags
}
FileRecord contains information about a multimedia file.
type Gedcom ¶
type Gedcom struct {
Header *Header
Family []*FamilyRecord
Individual []*IndividualRecord
Media []*MediaRecord
Repository []*RepositoryRecord
Source []*SourceRecord
Submitter []*SubmitterRecord
Trailer *Trailer
UserDefined []UserDefinedTag
}
Gedcom represents a complete GEDCOM file. It is the top-level container returned by Decoder.Decode and accepted by Encoder.Encode.
type Header ¶
type Header struct {
SourceSystem SystemRecord
Destination string
Date string
Time string
Submitter *SubmitterRecord
Submission *SubmissionRecord
Filename string
Copyright string
Version string
Form string
CharacterSet string
CharacterSetVersion string
Language string
Place PlaceRecord
Note string
UserDefined []UserDefinedTag
}
A Header contains information about the GEDCOM file.
type IndividualRecord ¶
type IndividualRecord struct {
Xref string // Unique cross-reference identifier (e.g., "@I1@")
Name []*NameRecord // Names (may have multiple for maiden names, aliases)
Sex string // Sex: "M" for male, "F" for female, "U" for unknown
Event []*EventRecord // Life events (birth, death, burial, etc.)
Attribute []*EventRecord // Attributes (occupation, residence, education, etc.)
Parents []*FamilyLinkRecord // Links to families where this person is a child
Family []*FamilyLinkRecord // Links to families where this person is a spouse
Submitter []*SubmitterRecord // Submitters of this record
Association []*AssociationRecord // Associations with other individuals
PermanentRecordFileNumber string // Permanent record file number
AncestralFileNumber string // Ancestral file number
UserReference []*UserReferenceRecord // User-provided reference numbers
AutomatedRecordId string // Unique record ID assigned by the source system
Change ChangeRecord // Record of when this record was last modified
Note []*NoteRecord // Notes attached to this individual
Citation []*CitationRecord // Source citations for this individual
Media []*MediaRecord // Media objects (photos, documents)
UserDefined []UserDefinedTag // User-defined tags (prefixed with underscore)
}
IndividualRecord represents a person in GEDCOM. It contains the person's names, sex, life events, attributes, and links to their families.
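Because individuals and families reference each other through pointers, relationships can be followed without looking up cross-references. The sketch below finds a person's siblings by walking Parents, then Family, then Child. It uses simplified stand-in types (individual, familyLink, family) that copy only the field names shown in this documentation, so that it runs without the package; the siblings helper is hypothetical and not part of the API.

```go
package main

import "fmt"

// Simplified stand-ins for IndividualRecord, FamilyLinkRecord and
// FamilyRecord, keeping only the fields this sketch needs.
type individual struct {
	Xref    string
	Parents []*familyLink
}

type familyLink struct {
	Family *family
}

type family struct {
	Xref    string
	Husband *individual
	Wife    *individual
	Child   []*individual
}

// siblings returns the other children of every family in which p is a child.
// Each sibling is reported once, even if shared across several families.
func siblings(p *individual) []*individual {
	var out []*individual
	seen := map[*individual]bool{p: true}
	for _, link := range p.Parents {
		if link == nil || link.Family == nil {
			continue
		}
		for _, c := range link.Family.Child {
			if c != nil && !seen[c] {
				seen[c] = true
				out = append(out, c)
			}
		}
	}
	return out
}

func main() {
	father := &individual{Xref: "I1"}
	mother := &individual{Xref: "I2"}
	alice := &individual{Xref: "I3"}
	bob := &individual{Xref: "I4"}
	fam := &family{Xref: "F1", Husband: father, Wife: mother, Child: []*individual{alice, bob}}
	alice.Parents = []*familyLink{{Family: fam}}
	bob.Parents = []*familyLink{{Family: fam}}

	for _, s := range siblings(alice) {
		fmt.Println(s.Xref) // prints I4
	}
}
```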
type Line ¶ added in v0.1.0
type Line struct {
Level int // Hierarchy level (0 for top-level records)
Tag string // GEDCOM tag (e.g., "INDI", "NAME", "BIRT")
Value string // Optional value following the tag
Xref string // Optional cross-reference identifier (e.g., "I1" from "@I1@")
LineNumber int // Line number in the input file (1-indexed)
Offset int // Character offset in the input file
}
Line represents a single line from a GEDCOM file after tokenization. A GEDCOM line has the format "level [xref] tag [value]", for example "0 @I1@ INDI" or "1 NAME John /Smith/".
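As an illustration of the line format, the sketch below splits one line into its level, cross-reference, tag, and value. It is a simplified stand-in, not the package's Scanner: the line type and parseLine function are hypothetical, and validation of tag characters and line length is omitted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// line is a stand-in holding a subset of the fields of Line.
type line struct {
	Level int
	Xref  string
	Tag   string
	Value string
}

// parseLine splits "level [xref] tag [value]" into its parts.
// The cross-reference is stored without its surrounding @ signs.
func parseLine(s string) (line, error) {
	var l line
	levelStr, rest, ok := strings.Cut(strings.TrimLeft(s, " \t"), " ")
	if !ok {
		return l, fmt.Errorf("missing tag in %q", s)
	}
	level, err := strconv.Atoi(levelStr)
	if err != nil {
		return l, fmt.Errorf("bad level in %q: %w", s, err)
	}
	l.Level = level

	// An optional cross-reference such as @I1@ comes before the tag.
	if strings.HasPrefix(rest, "@") {
		xref, after, ok := strings.Cut(rest, " ")
		if !ok || len(xref) < 3 || !strings.HasSuffix(xref, "@") {
			return l, fmt.Errorf("bad xref in %q", s)
		}
		l.Xref = strings.Trim(xref, "@")
		rest = after
	}

	// Everything after the tag is the value, spaces included.
	l.Tag, l.Value, _ = strings.Cut(rest, " ")
	if l.Tag == "" {
		return l, fmt.Errorf("missing tag in %q", s)
	}
	return l, nil
}

func main() {
	for _, s := range []string{"0 @I1@ INDI", "1 NAME John /Smith/"} {
		l, err := parseLine(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", l)
	}
}
```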
type MediaRecord ¶
type MediaRecord struct {
Xref string // Unique cross-reference identifier (e.g., "@M1@")
File []*FileRecord // File references for this media object
Title string // Title or description of the media
UserReference []*UserReferenceRecord // User-provided reference numbers
AutomatedRecordId string // Unique record ID assigned by the source system
Change ChangeRecord // Record of when this record was last modified
Note []*NoteRecord // Notes attached to this media
Citation []*CitationRecord // Source citations
UserDefined []UserDefinedTag // User-defined tags
}
MediaRecord represents a multimedia object such as a photo, document, or audio/video recording. It can be referenced by individuals, families, events, and sources.
type NameRecord ¶
type NameRecord struct {
Name string // Full name in GEDCOM format (e.g., "John /Smith/ Jr.")
Type string // Name type (e.g., "birth", "married", "aka")
NamePiecePrefix string // Name prefix (e.g., "Dr.", "Rev.")
NamePieceGiven string // Given name(s)
NamePieceNick string // Nickname
NamePieceSurnamePrefix string // Surname prefix (e.g., "van", "de")
NamePieceSurname string // Surname
NamePieceSuffix string // Name suffix (e.g., "Jr.", "III")
Phonetic []*VariantNameRecord // Phonetic variants of the name
Romanized []*VariantNameRecord // Romanized variants of the name
Citation []*CitationRecord // Source citations for this name
Note []*NoteRecord // Notes about this name
UserDefined []UserDefinedTag // User-defined tags
}
NameRecord represents a name for an individual. An individual may have multiple names (e.g., maiden name, married name, aliases). The Name field contains the full name in GEDCOM format: "Given Name /Surname/ Suffix". Use SplitPersonalName to parse the Name field into components.
type NoteRecord ¶
type NoteRecord struct {
Note string // The note text
Citation []*CitationRecord // Source citations for the note
}
NoteRecord contains a note or comment attached to a record.
type ParsedName ¶ added in v0.0.4
type ParsedName struct {
Full string // Reconstructed full name without GEDCOM delimiters
Given string // Given name(s) / first name(s)
Surname string // Surname / family name / last name
Suffix string // Name suffix (e.g., "Jr.", "III", "PhD")
Nickname string // Nickname, if present in quotes
}
ParsedName contains the components of a personal name after parsing with SplitPersonalName.
func SplitPersonalName ¶ added in v0.0.4
func SplitPersonalName(name string) ParsedName
SplitPersonalName parses a GEDCOM-formatted personal name into its components. GEDCOM names use slashes to delimit the surname: "Given Names /Surname/ Suffix".
Examples:
SplitPersonalName("John /Smith/")
// Returns: Given="John", Surname="Smith"
SplitPersonalName("John \"Jack\" /Smith/ Jr.")
// Returns: Given="John", Nickname="Jack", Surname="Smith", Suffix="Jr."
SplitPersonalName("Mary Jane /van der Berg/")
// Returns: Given="Mary Jane", Surname="van der Berg"
The function also handles alternative surnames separated by slashes within the surname delimiters (e.g., "/Smith/Smyth/" becomes Surname="Smith/Smyth").
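The splitting rules described above can be approximated as follows. This is a simplified sketch, not the implementation of SplitPersonalName: the parsedName type and splitName function are hypothetical, the surname is taken to be everything between the first and last slash, and the nickname is the first double-quoted span before the surname.

```go
package main

import (
	"fmt"
	"strings"
)

// parsedName is a stand-in with a subset of the fields of ParsedName.
type parsedName struct {
	Given    string
	Nickname string
	Surname  string
	Suffix   string
}

// splitName separates a GEDCOM-formatted name into its components.
func splitName(name string) parsedName {
	var p parsedName
	before := name
	if open := strings.Index(name, "/"); open >= 0 {
		before = name[:open]
		rest := name[open+1:]
		if end := strings.LastIndex(rest, "/"); end >= 0 {
			// Inner slashes stay part of the surname ("Smith/Smyth").
			p.Surname = strings.TrimSpace(rest[:end])
			p.Suffix = strings.TrimSpace(rest[end+1:])
		} else {
			// No closing slash: treat the remainder as the surname.
			p.Surname = strings.TrimSpace(rest)
		}
	}
	// Extract a quoted nickname from the part before the surname.
	if q1 := strings.Index(before, `"`); q1 >= 0 {
		if q2 := strings.Index(before[q1+1:], `"`); q2 >= 0 {
			p.Nickname = before[q1+1 : q1+1+q2]
			before = before[:q1] + before[q1+q2+2:]
		}
	}
	// Collapse runs of whitespace left behind by the removals.
	p.Given = strings.Join(strings.Fields(before), " ")
	return p
}

func main() {
	fmt.Printf("%+v\n", splitName(`John "Jack" /Smith/ Jr.`))
	fmt.Printf("%+v\n", splitName("Mary Jane /van der Berg/"))
}
```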
type PlaceRecord ¶
type PlaceRecord struct {
Name string // Place name (jurisdiction hierarchy)
Phonetic []*VariantPlaceNameRecord // Phonetic variants of the place name
Romanized []*VariantPlaceNameRecord // Romanized variants of the place name
Latitude string // Latitude in GEDCOM format (e.g., "N50.9333")
Longitude string // Longitude in GEDCOM format (e.g., "W1.8")
Citation []*CitationRecord // Source citations
Note []*NoteRecord // Notes about the place
}
PlaceRecord represents a geographic location. The Name field typically contains a comma-separated jurisdiction hierarchy (e.g., "City, County, State, Country").
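The Name, Latitude, and Longitude fields are plain strings, so callers that need structured values must convert them. The sketch below shows one way to do so. Both helpers (jurisdictions and decimalDegrees) are hypothetical and not part of this package; the coordinate conversion assumes a single hemisphere letter (N, S, E, or W) followed by decimal degrees, as in the examples above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// jurisdictions splits a place name into its jurisdiction levels, trimming
// the space around each one. Empty levels are kept so positions stay aligned.
func jurisdictions(name string) []string {
	parts := strings.Split(name, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts
}

// decimalDegrees converts a coordinate such as "N50.9333" or "W1.8" into
// signed decimal degrees: north and east are positive, south and west negative.
func decimalDegrees(coord string) (float64, error) {
	if len(coord) < 2 {
		return 0, fmt.Errorf("invalid coordinate %q", coord)
	}
	v, err := strconv.ParseFloat(coord[1:], 64)
	if err != nil {
		return 0, fmt.Errorf("invalid coordinate %q: %w", coord, err)
	}
	switch coord[0] {
	case 'N', 'E':
		return v, nil
	case 'S', 'W':
		return -v, nil
	}
	return 0, fmt.Errorf("invalid hemisphere in %q", coord)
}

func main() {
	fmt.Println(jurisdictions("City, County, State, Country"))
	lon, err := decimalDegrees("W1.8")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(lon) // prints -1.8
}
```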
type RepositoryRecord ¶
type RepositoryRecord struct {
Xref string // Unique cross-reference identifier (e.g., "@R1@")
Name string // Name of the repository
Address AddressRecord // Address of the repository
Note []*NoteRecord // Notes about the repository
UserReference []*UserReferenceRecord // User-provided reference numbers
AutomatedRecordId string // Unique record ID assigned by the source system
Change ChangeRecord // Record of when this record was last modified
UserDefined []UserDefinedTag // User-defined tags
}
RepositoryRecord represents a repository where source documents are held, such as a library, archive, or private collection.
type ScanErr ¶ added in v0.1.0
type ScanErr struct {
Err error // The underlying error
LineNumber int // Line number where the error occurred (1-indexed)
Offset int // Character offset within the line
}
ScanErr represents a scanning error with location information. It wraps the underlying error and includes the line number and character offset where the error occurred.
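An error type that carries location information is typically inspected with errors.As. The sketch below shows the pattern with a stand-in type that has the same fields as ScanErr. The Error and Unwrap methods, the pointer receiver, and the decodeStub and lineOf helpers are assumptions made for this example; check the package source for the methods ScanErr actually provides.

```go
package main

import (
	"errors"
	"fmt"
)

// scanErr is a stand-in with the same fields as ScanErr.
type scanErr struct {
	Err        error
	LineNumber int
	Offset     int
}

// Error reports the location followed by the underlying error (assumed format).
func (e *scanErr) Error() string {
	return fmt.Sprintf("line %d, offset %d: %v", e.LineNumber, e.Offset, e.Err)
}

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *scanErr) Unwrap() error { return e.Err }

var errBadLevel = errors.New("invalid level")

// decodeStub is a hypothetical function that fails with a wrapped scanErr.
func decodeStub() error {
	return fmt.Errorf("decode: %w", &scanErr{Err: errBadLevel, LineNumber: 12, Offset: 0})
}

// lineOf returns the line number recorded in err, if err wraps a scanErr.
func lineOf(err error) (int, bool) {
	var se *scanErr
	if errors.As(err, &se) {
		return se.LineNumber, true
	}
	return 0, false
}

func main() {
	err := decodeStub()
	if n, ok := lineOf(err); ok {
		fmt.Println(n, errors.Is(err, errBadLevel)) // prints 12 true
	}
}
```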
type Scanner ¶ added in v0.1.0
type Scanner struct {
// contains filtered or unexported fields
}
Scanner tokenizes GEDCOM input line by line. It is a low-level component used by Decoder. Most users should use Decoder directly instead.
A Scanner reads from an io.RuneScanner and breaks the input into GEDCOM lines, parsing the level, optional cross-reference, tag, and optional value from each line.
func NewScanner ¶ added in v0.1.0
func NewScanner(r io.RuneScanner) *Scanner
NewScanner creates a new Scanner that reads from r. Use Scanner.Next to advance through the input and Scanner.Line to retrieve the current line after each successful call to Next.
func (*Scanner) Err ¶ added in v0.1.0
Err returns the first non-EOF error that was encountered by the Scanner.
type SourceCallNumberRecord ¶ added in v0.0.4
type SourceCallNumberRecord struct {
CallNumber string // The call number or shelf location
MediaType string // Type of media (e.g., "book", "microfilm")
}
SourceCallNumberRecord contains a call number for a source in a repository.
type SourceDataRecord ¶ added in v0.0.4
type SourceDataRecord struct {
Event []*SourceEventRecord // Events covered by the source
}
SourceDataRecord contains data recorded from a source.
type SourceEventRecord ¶ added in v0.0.4
type SourceEventRecord struct {
Kind string // Type of event (e.g., "BIRT", "MARR", "DEAT")
Date string // Date range covered by the source
Place string // Place jurisdiction covered by the source
}
SourceEventRecord describes an event type covered by a source.
type SourceRecord ¶
type SourceRecord struct {
Xref string // Unique cross-reference identifier (e.g., "@S1@")
Title string // Title of the source
Data *SourceDataRecord // Data recorded from the source
Originator string // Author or creator of the source
FiledBy string // Person who filed the source
PublicationFacts string // Publication information
Text string // Verbatim text from the source
Repository *SourceRepositoryRecord // Repository where the source is held
UserReference []*UserReferenceRecord // User-provided reference numbers
AutomatedRecordId string // Unique record ID assigned by the source system
Change ChangeRecord // Record of when this record was last modified
Note []*NoteRecord // Notes about the source
Media []*MediaRecord // Media objects (photos of documents, etc.)
UserDefined []UserDefinedTag // User-defined tags
}
SourceRecord represents a source of genealogical information, such as a book, document, website, or oral interview.
type SourceRepositoryRecord ¶ added in v0.0.4
type SourceRepositoryRecord struct {
Repository *RepositoryRecord // The repository holding the source
Note []*NoteRecord // Notes about the source at this repository
CallNumber []*SourceCallNumberRecord // Call numbers for locating the source
}
SourceRepositoryRecord links a source to its repository.
type SubmissionRecord ¶
type SubmissionRecord struct {
Xref string
}
SubmissionRecord contains information about a batch submission of genealogical data.
type SubmitterRecord ¶
type SubmitterRecord struct {
Xref string // Unique cross-reference identifier (e.g., "@SUBM1@")
Name string // Name of the submitter
Address *AddressRecord // Address of the submitter
Media []*MediaRecord // Media objects (e.g., photo of submitter)
Language []string // Languages used by the submitter
SubmitterRecordFileID string // Submitter record file identifier
AutomatedRecordId string // Unique record ID assigned by the source system
Note []*NoteRecord // Notes from the submitter
Change *ChangeRecord // Record of when this record was last modified
}
SubmitterRecord contains information about the person or organization that submitted the genealogical data.
type SystemRecord ¶
type SystemRecord struct {
Xref string
Version string
ProductName string
BusinessName string
Address AddressRecord
SourceName string
SourceDate string
SourceCopyright string
UserDefined []UserDefinedTag
}
A SystemRecord contains information about the system that produced the GEDCOM.
type UserDefinedTag ¶ added in v0.0.4
type UserDefinedTag struct {
Tag string
Value string
Xref string
Level int
UserDefined []UserDefinedTag
}
A UserDefinedTag is a tag that is not defined in the GEDCOM specification but is included by the publisher of the data. In GEDCOM, user-defined tags must be prefixed with an underscore; the prefix is preserved in the Tag field.
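Since a UserDefinedTag can contain further UserDefinedTag values, inspecting vendor extensions usually means walking a small tree. The sketch below collects tag names depth-first using a stand-in type with the same fields as UserDefinedTag. The collectTags helper is hypothetical, and the tag names in the sample data are illustrative only.

```go
package main

import "fmt"

// userDefinedTag is a stand-in with the same fields as UserDefinedTag.
type userDefinedTag struct {
	Tag         string
	Value       string
	Xref        string
	Level       int
	UserDefined []userDefinedTag
}

// collectTags returns the names of all tags in depth-first order,
// with each parent listed before its children.
func collectTags(tags []userDefinedTag) []string {
	var out []string
	for _, t := range tags {
		out = append(out, t.Tag)
		out = append(out, collectTags(t.UserDefined)...)
	}
	return out
}

func main() {
	tags := []userDefinedTag{
		{Tag: "_UID", Value: "ABC123", Level: 1},
		{Tag: "_MILT", Level: 1, UserDefined: []userDefinedTag{
			{Tag: "_RANK", Value: "Sergeant", Level: 2},
		}},
	}
	fmt.Println(collectTags(tags)) // prints [_UID _MILT _RANK]
}
```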
type UserReferenceRecord ¶ added in v0.0.4
type UserReferenceRecord struct {
Number string // The reference number
Type string // The type of reference
}
UserReferenceRecord contains a user-defined reference number for a record.
type VariantNameRecord ¶ added in v0.2.0
type VariantNameRecord struct {
Name string // Full variant name
Type string // Variant type (e.g., "kana", "hangul" for phonetic)
NamePiecePrefix string // Name prefix
NamePieceGiven string // Given name(s)
NamePieceNick string // Nickname
NamePieceSurnamePrefix string // Surname prefix
NamePieceSurname string // Surname
NamePieceSuffix string // Name suffix
Citation []*CitationRecord // Source citations
Note []*NoteRecord // Notes
}
VariantNameRecord represents a phonetic or romanized variant of a name.
type VariantPlaceNameRecord ¶ added in v0.2.0
type VariantPlaceNameRecord struct {
Name string // The variant place name
Type string // Variant type (e.g., "kana", "hangul" for phonetic)
}
VariantPlaceNameRecord represents a phonetic or romanized variant of a place name.