set

package
v1.0.1 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 6, 2026 License: MIT Imports: 13 Imported by: 0

README

SET data (Thai Stock Exchange)

The goal is an API that returns the SET data for a company, including its English and Thai names, given either its ticker symbol or its company name. It also translates a Thai company name into its English equivalent.

Installation

go get github.com/jwitmann/thai-market-data/set

Usage

import "github.com/jwitmann/thai-market-data/set"

// Create client
client, err := set.NewClient(dataDir)
if err != nil {
    log.Fatal(err)
}

// Get company by ticker symbol
company, err := client.GetBySymbol("PTT")
// Returns: {NameEN: "PTT PUBLIC COMPANY LIMITED", NameTH: "บริษัท ปตท. จำกัด (มหาชน)", ...}

// Search by name (fuzzy matching)
company, err = client.GetByName("PTT PUBLIC COMPANY LIMITED")
company, err = client.GetByName("บริษัท ปตท. จำกัด (มหาชน)")

// Check if a name contains Thai characters
isThai := client.IsThaiName("บริษัท ปตท.")

// Translate Thai company name to English
englishName := client.TranslateName("หุ้นสามัญของบริษัท ปตท. จำกัด (มหาชน)")
// Returns: "PTT PUBLIC COMPANY LIMITED"
// (Strips "หุ้นสามัญของ" prefix and looks up company)

// Translate Thai sector name to English
sectorEN := client.TranslateSector("พลังงานและสาธารณูปโภค")
// Returns: "Energy & Utilities"

// Translate Thai industry name to English
industryEN := client.TranslateIndustry("ทรัพยากร")
// Returns: "Resources"

Auto-Update

The client supports automatic updates from the SET website:

// Check if data needs update (older than 30 days)
if client.NeedsUpdate() {
    // Update in background
    go func() {
        if err := client.FetchAndUpdate(); err != nil {
            log.Printf("SET update failed: %v", err)
        }
    }()
}
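
The 30-day rule behind NeedsUpdate can be illustrated with a small standalone sketch. It is not the package's implementation: the helper name isStale, the reference time refTime, and the RFC 3339 timestamp format are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// maxAge mirrors the 30-day threshold documented for NeedsUpdate.
const maxAge = 30 * 24 * time.Hour

// refTime is a fixed "now" so the example is reproducible (hypothetical).
var refTime = time.Date(2026, time.March, 6, 0, 0, 0, 0, time.UTC)

// isStale is a hypothetical helper. It reports whether data stamped with
// lastUpdate should be refreshed at time now. An empty or unparsable
// timestamp counts as "never updated". RFC 3339 is an assumed format.
func isStale(lastUpdate string, now time.Time) bool {
	t, err := time.Parse(time.RFC3339, lastUpdate)
	if err != nil {
		return true
	}
	return now.Sub(t) > maxAge
}

func main() {
	fmt.Println(isStale("2026-03-01T00:00:00Z", refTime)) // false: 5 days old
	fmt.Println(isStale("2026-01-01T00:00:00Z", refTime)) // true: 64 days old
	fmt.Println(isStale("", refTime))                     // true: never updated
}
```

Passing the current time as a parameter keeps the check easy to test without touching the system clock.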

Architecture

Data Flow
  1. Data Loading:

    • Downloads HTML tables from SET.or.th (English and Thai versions)
    • Parses and converts TIS-620 encoding to UTF-8
    • Merges data by symbol (ticker)
    • Creates unique IDs for industries and sectors
    • Outputs ${DATA_DIR}/SET_mappings.json
  2. JSON Structure:

{
  "companies": {
    "PTT": {
      "name_en": "PTT PUBLIC COMPANY LIMITED",
      "name_th": "บริษัท ปตท. จำกัด (มหาชน)",
      "market": "SET",
      "industry_id": "ind_007",
      "sector_id": "sec_005"
    }
  },
  "industries": {
    "ind_007": {
      "name_th": "ทรัพยากร",
      "name_en": "Resources"
    }
  },
  "sectors": {
    "sec_005": {
      "name_th": "พลังงานและสาธารณูปโภค",
      "name_en": "Energy & Utilities"
    }
  }
}
  3. Runtime API (set/client.go):
    • Loads JSON file on startup
    • Provides lookup and translation functions
    • Maintains in-memory cache for fast access
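
The "merge by symbol" and "unique IDs" steps of the data flow above could look roughly like the following sketch. The row and merged types, the function names, and the alphabetical numbering scheme are assumptions for illustration; the package's real IDs (for example ind_007) may be assigned differently.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical, simplified record parsed from one language's table.
type row struct {
	Symbol, Name, Industry string
}

// merged pairs the English and Thai names for one symbol.
type merged struct {
	NameEN, NameTH, IndustryID string
}

// assignIDs gives each distinct industry name an ID such as "ind_001".
// Names are sorted first so the numbering does not depend on input order.
func assignIDs(rows []row) map[string]string {
	seen := map[string]bool{}
	var names []string
	for _, r := range rows {
		if !seen[r.Industry] {
			seen[r.Industry] = true
			names = append(names, r.Industry)
		}
	}
	sort.Strings(names)
	ids := make(map[string]string, len(names))
	for i, n := range names {
		ids[n] = fmt.Sprintf("ind_%03d", i+1)
	}
	return ids
}

// mergeBySymbol joins English and Thai rows on the ticker symbol.
// Symbols missing from the Thai table keep an empty NameTH.
func mergeBySymbol(en, th []row) map[string]merged {
	ids := assignIDs(en)
	thNames := make(map[string]string, len(th))
	for _, r := range th {
		thNames[r.Symbol] = r.Name
	}
	out := make(map[string]merged, len(en))
	for _, r := range en {
		out[r.Symbol] = merged{
			NameEN:     r.Name,
			NameTH:     thNames[r.Symbol],
			IndustryID: ids[r.Industry],
		}
	}
	return out
}

func main() {
	// Illustrative sample rows, not a real download.
	en := []row{
		{"PTT", "PTT PUBLIC COMPANY LIMITED", "Resources"},
		{"KBANK", "KASIKORNBANK PUBLIC COMPANY LIMITED", "Financials"},
	}
	th := []row{{"PTT", "บริษัท ปตท. จำกัด (มหาชน)", "ทรัพยากร"}}
	for sym, c := range mergeBySymbol(en, th) {
		fmt.Println(sym, c.IndustryID, c.NameEN, c.NameTH)
	}
}
```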

Company Structure

type Company struct {
    NameEN     string // English company name
    NameTH     string // Thai company name
    Market     string // "SET" or "mai"
    IndustryID string // Reference to industries map
    SectorID   string // Reference to sectors map
}

type Industry struct {
    NameTH string // Thai industry name
    NameEN string // English industry name
}

type Sector struct {
    NameTH string // Thai sector name
    NameEN string // English sector name
}
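
To show how the ID references fit together, here is a minimal sketch that decodes the JSON structure shown earlier and resolves a company's industry and sector names. The lower-case type names, the sample constant, and the describe helper are hypothetical; only the JSON keys come from the documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented JSON shape (names are hypothetical).
type company struct {
	NameEN     string `json:"name_en"`
	NameTH     string `json:"name_th"`
	Market     string `json:"market"`
	IndustryID string `json:"industry_id"`
	SectorID   string `json:"sector_id"`
}

type named struct {
	NameTH string `json:"name_th"`
	NameEN string `json:"name_en"`
}

type mappings struct {
	Companies  map[string]company `json:"companies"`
	Industries map[string]named   `json:"industries"`
	Sectors    map[string]named   `json:"sectors"`
}

// sample reproduces the PTT entry from the JSON structure above.
const sample = `{
  "companies": {"PTT": {"name_en": "PTT PUBLIC COMPANY LIMITED",
    "name_th": "บริษัท ปตท. จำกัด (มหาชน)", "market": "SET",
    "industry_id": "ind_007", "sector_id": "sec_005"}},
  "industries": {"ind_007": {"name_th": "ทรัพยากร", "name_en": "Resources"}},
  "sectors": {"sec_005": {"name_th": "พลังงานและสาธารณูปโภค", "name_en": "Energy & Utilities"}}
}`

// describe decodes data and follows the industry and sector IDs of symbol.
func describe(data []byte, symbol string) (string, error) {
	var m mappings
	if err := json.Unmarshal(data, &m); err != nil {
		return "", err
	}
	c, ok := m.Companies[symbol]
	if !ok {
		return "", fmt.Errorf("symbol %q not found", symbol)
	}
	return fmt.Sprintf("%s: %s / %s",
		c.NameEN, m.Industries[c.IndustryID].NameEN, m.Sectors[c.SectorID].NameEN), nil
}

func main() {
	s, err := describe([]byte(sample), "PTT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s) // PTT PUBLIC COMPANY LIMITED: Resources / Energy & Utilities
}
```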

Data Sources

The SET client downloads data directly from SET.or.th. The HTML tables contain the following fields:

English version (SET_listedCompanies_en.csv):

  • Symbol, Company, Market, Industry, Sector, Address, Zip code, Tel., Fax, Website

Thai version (SET_listedCompanies_th.csv):

  • หลักทรัพย์, บริษัท, ตลาด, กลุ่มอุตสาหกรรม, หมวดธุรกิจ, ที่อยู่, รหัสไปรษณีย์, โทรศัพท์, โทรสาร, เว๊บไซต์
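
The Thai download has to be converted from TIS-620 to UTF-8 before it can be parsed, as noted under Data Flow. The sketch below shows the idea using only the standard library; the function name tis620ToUTF8 is hypothetical and this is not necessarily how the package does it. In TIS-620, bytes 0xA1–0xDA and 0xDF–0xFB map to U+0E01–U+0E3A and U+0E3F–U+0E5B, a constant offset of 0x0D60. A real program would more likely rely on golang.org/x/text/encoding/charmap (for example its Windows874 decoder, a superset of TIS-620).

```go
package main

import "fmt"

// tis620ToUTF8 decodes TIS-620 bytes into a Go (UTF-8) string.
// ASCII passes through, assigned Thai bytes are shifted into the Unicode
// Thai block, and unassigned bytes become U+FFFD.
func tis620ToUTF8(b []byte) string {
	runes := make([]rune, 0, len(b))
	for _, c := range b {
		switch {
		case c < 0x80:
			runes = append(runes, rune(c))
		case (c >= 0xA1 && c <= 0xDA) || (c >= 0xDF && c <= 0xFB):
			runes = append(runes, rune(c)+0x0D60)
		default:
			runes = append(runes, '\uFFFD')
		}
	}
	return string(runes)
}

func main() {
	// 0xBB 0xB5 0xB7 spells "ปตท" in TIS-620.
	fmt.Println(tis620ToUTF8([]byte{0xBB, 0xB5, 0xB7, '.'})) // ปตท.
}
```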

Fuzzy Matching Algorithm

The GetByName function uses aggressive fuzzy matching, trying the following strategies in order:

  1. Exact match on the uppercased name
  2. Case-insensitive exact match
  3. Substring match: the search query is contained in the company name
  4. Progressive word removal for Thai names, dropping the last word after each failed attempt:
    • "บริษัท ปตท. จำกัด (มหาชน)" → not found
    • "บริษัท ปตท. จำกัด" → not found
    • "บริษัท ปตท." → not found
    • "บริษัท" → found (matches any company with "บริษัท")
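
The four steps above can be sketched as a standalone function. This is a hypothetical re-creation, not the package's code: findByName and sampleNames are made-up names, names are searched in slice order, and step 4 is applied to any query rather than only Thai ones.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleNames is illustrative data taken from the examples in this README.
var sampleNames = []string{
	"PTT PUBLIC COMPANY LIMITED",
	"KASIKORNBANK PUBLIC COMPANY LIMITED",
	"บริษัท ปตท. จำกัด (มหาชน)",
	"ธนาคารกสิกรไทย จำกัด (มหาชน)",
}

// findByName returns the first name matched by any of the four strategies.
func findByName(names []string, query string) (string, bool) {
	q := strings.TrimSpace(query)
	if q == "" {
		return "", false
	}
	up := strings.ToUpper(q)
	// Step 1: exact match on the uppercased query.
	for _, n := range names {
		if n == up {
			return n, true
		}
	}
	// Step 2: case-insensitive exact match.
	for _, n := range names {
		if strings.EqualFold(n, q) {
			return n, true
		}
	}
	// Step 3: the query is contained in the name.
	for _, n := range names {
		if strings.Contains(strings.ToUpper(n), up) {
			return n, true
		}
	}
	// Step 4: drop trailing words one at a time and retry the substring match.
	words := strings.Fields(q)
	for i := len(words) - 1; i > 0; i-- {
		shorter := strings.Join(words[:i], " ")
		for _, n := range names {
			if strings.Contains(n, shorter) {
				return n, true
			}
		}
	}
	return "", false
}

func main() {
	for _, q := range []string{"KASIKORNBANK", "บริษัท ไม่มี จริง", "zzz"} {
		name, ok := findByName(sampleNames, q)
		fmt.Printf("%q -> %q (%v)\n", q, name, ok)
	}
}
```

The second query shows the cost of step 4: once only "บริษัท" is left, the first name containing that word is returned, so callers may want to validate the result.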

Examples

Lookup by Symbol

company, _ := client.GetBySymbol("KBANK")
fmt.Println(company.NameEN)  // "KASIKORNBANK PUBLIC COMPANY LIMITED"
fmt.Println(company.NameTH)  // "ธนาคารกสิกรไทย จำกัด (มหาชน)"

Lookup by Name

// Thai name with prefix
company, _ := client.GetByName("หุ้นสามัญของบริษัท ปตท. จำกัด (มหาชน)")
// Works because fuzzy matching finds "บริษัท ปตท. จำกัด (มหาชน)"

// Partial match
company, _ = client.GetByName("KASIKORNBANK")
// Works because "KASIKORNBANK" is contained in "KASIKORNBANK PUBLIC COMPANY LIMITED"

Portfolio Translation

// Translate portfolio holdings
for i, holding := range portfolio.TopHoldings {
    if translated := client.TranslateName(holding.Name); translated != "" {
        portfolio.TopHoldings[i].Name = translated
    }
}
// Each Thai company name is replaced with its English equivalent

Documentation

Constants

const (
	SETURLEN = "https://www.set.or.th/dat/eod/listedcompany/static/listedCompanies_en_US.xls"
	SETURLTH = "https://www.set.or.th/dat/eod/listedcompany/static/listedCompanies_th_TH.xls"
)

Variables

This section is empty.

Functions

func FetchAndSaveNew

func FetchAndSaveNew(dataDir string) error

FetchAndSaveNew fetches SET data from the website and saves it to the given data directory. It is a standalone function for the case where no data exists yet.

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client provides SET data lookup and translation services

func NewClient

func NewClient(dataDir string) (*Client, error)

NewClient creates a new SET client with the given data directory

func (*Client) FetchAndUpdate

func (c *Client) FetchAndUpdate() error

FetchAndUpdate downloads the latest SET data from the official SET website. It returns an error on failure, but the existing data remains usable.

func (*Client) GetByName

func (c *Client) GetByName(name string) (*Company, error)

GetByName searches for a company by name (exact or fuzzy match)

func (*Client) GetBySymbol

func (c *Client) GetBySymbol(symbol string) (*Company, error)

GetBySymbol looks up a company by its ticker symbol

func (*Client) GetCustomTranslation

func (c *Client) GetCustomTranslation(thaiText, transType string) (string, bool)

GetCustomTranslation looks up a custom translation

func (*Client) GetSETData

func (c *Client) GetSETData() *SETData

GetSETData returns the raw SET data (for advanced use cases)

func (*Client) IsThaiName

func (c *Client) IsThaiName(name string) bool

IsThaiName checks if a string contains Thai characters
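
A check like this can be written with the standard library's Unicode script tables. The sketch below is an assumption about the approach, not the package's source, and containsThai is a hypothetical name.

```go
package main

import (
	"fmt"
	"unicode"
)

// containsThai reports whether s has at least one rune in the Thai script.
func containsThai(s string) bool {
	for _, r := range s {
		if unicode.Is(unicode.Thai, r) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsThai("บริษัท ปตท.")) // true
	fmt.Println(containsThai("PTT"))         // false
}
```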

func (*Client) LogUntranslated

func (c *Client) LogUntranslated(thaiText, transType, fundID string)

LogUntranslated records an item that couldn't be translated

func (*Client) NeedsUpdate

func (c *Client) NeedsUpdate() bool

NeedsUpdate checks whether the SET data needs to be updated, based on the last update timestamp. It returns true if the data is older than 30 days or has never been updated.

func (*Client) SetCustomTranslation

func (c *Client) SetCustomTranslation(thaiText, englishText, transType string, verified bool)

SetCustomTranslation adds or updates a custom translation

func (*Client) TranslateIndustry

func (c *Client) TranslateIndustry(thaiIndustry string) string

TranslateIndustry translates a Thai industry name to English

func (*Client) TranslateName

func (c *Client) TranslateName(thaiName string) string

TranslateName translates a Thai company name to English
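
The README notes that TranslateName strips the "หุ้นสามัญของ" ("ordinary shares of") prefix before looking up the company. A minimal sketch of that idea follows. The lookup table, the function name translateName, and the empty-string result for unknown names are assumptions (the last one is suggested by the portfolio example, which checks for an empty result).

```go
package main

import (
	"fmt"
	"strings"
)

// ordinarySharesPrefix is the Thai prefix meaning "ordinary shares of".
const ordinarySharesPrefix = "หุ้นสามัญของ"

// thaiToEnglish is a hypothetical stand-in for the loaded SET data.
var thaiToEnglish = map[string]string{
	"บริษัท ปตท. จำกัด (มหาชน)": "PTT PUBLIC COMPANY LIMITED",
}

// translateName removes the prefix, then looks up the remaining Thai name.
// It returns "" when the name is unknown.
func translateName(thaiName string) string {
	name := strings.TrimPrefix(strings.TrimSpace(thaiName), ordinarySharesPrefix)
	return thaiToEnglish[strings.TrimSpace(name)]
}

func main() {
	fmt.Println(translateName("หุ้นสามัญของบริษัท ปตท. จำกัด (มหาชน)")) // PTT PUBLIC COMPANY LIMITED
}
```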

func (*Client) TranslateSector

func (c *Client) TranslateSector(thaiSector string) string

TranslateSector translates a Thai sector name to English

func (*Client) TranslateWithFallback

func (c *Client) TranslateWithFallback(thaiText, transType, fundID string) string

TranslateWithFallback attempts translation using multiple sources:

  1. Custom translations (user-defined)
  2. Hardcoded Finnomena mappings
  3. SET data (official Stock Exchange of Thailand)
  4. Returns the original text if no translation is found
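
The fallback order described for TranslateWithFallback can be modelled as a chain of lookups tried in sequence. The sketch below is illustrative only: the lookup type, fromMap, the three source variables, and the "Natural Resources" override are invented for the example.

```go
package main

import "fmt"

// lookup is one translation source: it returns the English text and
// whether the Thai text was known to that source.
type lookup func(thai string) (string, bool)

// fromMap adapts a plain map into a lookup.
func fromMap(m map[string]string) lookup {
	return func(thai string) (string, bool) {
		en, ok := m[thai]
		return en, ok
	}
}

// Hypothetical sources, in priority order.
var (
	customSrc    = fromMap(map[string]string{"ทรัพยากร": "Natural Resources"})
	hardcodedSrc = fromMap(map[string]string{})
	setSrc       = fromMap(map[string]string{
		"ทรัพยากร":            "Resources",
		"พลังงานและสาธารณูปโภค": "Energy & Utilities",
	})
)

// translateWithFallback returns the first non-empty hit, or the original
// text when no source knows it.
func translateWithFallback(thai string, sources ...lookup) string {
	for _, src := range sources {
		if en, ok := src(thai); ok && en != "" {
			return en
		}
	}
	return thai
}

func main() {
	for _, t := range []string{"ทรัพยากร", "พลังงานและสาธารณูปโภค", "ไม่รู้จัก"} {
		fmt.Println(translateWithFallback(t, customSrc, hardcodedSrc, setSrc))
	}
}
```

Because the custom source is consulted first, a user-defined entry overrides the SET data for the same Thai text.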

type Company

type Company struct {
	NameEN     string `json:"name_en"`
	NameTH     string `json:"name_th"`
	Market     string `json:"market"`
	IndustryID string `json:"industry_id"`
	SectorID   string `json:"sector_id"`
}

Company represents a SET-listed company

type CustomTranslations

type CustomTranslations struct {
	Version      int                    `json:"version"`
	Sectors      map[string]Translation `json:"sectors"`
	Industries   map[string]Translation `json:"industries"`
	Companies    map[string]Translation `json:"companies"`
	Untranslated []UntranslatedEntry    `json:"untranslated_log,omitempty"`
}

CustomTranslations stores user-defined translations
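
As an illustration of how such a store might be serialized, the sketch below encodes and decodes a reduced copy of the struct. The lower-case types, the encode and decode helpers, the sample entry, and the date format are assumptions. One detail worth knowing when writing names like "Energy & Utilities": json.Marshal escapes & as \u0026 by default, so the sketch uses an Encoder with SetEscapeHTML(false) to keep the file readable.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Reduced local mirrors of Translation and CustomTranslations.
type translation struct {
	EN       string `json:"en"`
	Verified bool   `json:"verified"`
	Date     string `json:"date"`
}

type customTranslations struct {
	Version int                    `json:"version"`
	Sectors map[string]translation `json:"sectors"`
}

// sampleCT is illustrative data; the date format is an assumption.
var sampleCT = customTranslations{
	Version: 1,
	Sectors: map[string]translation{
		"พลังงานและสาธารณูปโภค": {EN: "Energy & Utilities", Verified: true, Date: "2026-03-06"},
	},
}

// encode writes indented JSON without HTML escaping, so "&" stays literal.
func encode(ct customTranslations) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	enc.SetIndent("", "  ")
	if err := enc.Encode(ct); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// decode parses JSON produced by encode.
func decode(s string) (customTranslations, error) {
	var ct customTranslations
	err := json.Unmarshal([]byte(s), &ct)
	return ct, err
}

func main() {
	s, err := encode(sampleCT)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(s)
}
```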

type Industry

type Industry struct {
	NameTH string `json:"name_th"`
	NameEN string `json:"name_en"`
}

Industry represents an industry classification

type SETData

type SETData struct {
	Companies  map[string]Company  `json:"companies"`
	Industries map[string]Industry `json:"industries"`
	Sectors    map[string]Sector   `json:"sectors"`
	Metadata   SETMetadata         `json:"metadata,omitempty"`
}

SETData contains the complete SET exchange data

type SETMappings

type SETMappings struct {
	Companies  map[string]Company  `json:"companies"`
	Industries map[string]Industry `json:"industries"`
	Sectors    map[string]Sector   `json:"sectors"`
	Metadata   SETMetadata         `json:"metadata"`
}

type SETMetadata

type SETMetadata struct {
	LastUpdate  string `json:"last_update"`
	SourceURLEN string `json:"source_url_en,omitempty"`
	SourceURLTH string `json:"source_url_th,omitempty"`
	RecordCount int    `json:"record_count"`
}

SETMetadata contains metadata about the SET data source

type Sector

type Sector struct {
	NameTH string `json:"name_th"`
	NameEN string `json:"name_en"`
}

Sector represents a sector classification

type Translation

type Translation struct {
	EN       string `json:"en"`
	Verified bool   `json:"verified"`
	Date     string `json:"date"`
}

Translation represents a single translation entry

type UntranslatedEntry

type UntranslatedEntry struct {
	Text   string `json:"text"`
	Type   string `json:"type"`
	Date   string `json:"date"`
	FundID string `json:"fund_id,omitempty"`
}

UntranslatedEntry tracks items that couldn't be translated
