proxerscrape

package module
v0.0.0-...-643a26e Latest
Published: Jun 18, 2022 License: BSD-3-Clause Imports: 15 Imported by: 0

README

Proxer Profile Scraper

For now this only takes the Anime section of the profile as an HTML file and parses the four tables into arrays of animes.

There's a simple file that can show you what you still "have" to watch and how long it'll take. For example, to run it on my profile, you could do the following:

curl https://proxer.me/user/252835/anime | go run .
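
How the "how long it'll take" figure is computed isn't shown here, so the following is only a rough sketch of one way to do it. The `media` struct is a local stand-in that mirrors two fields of the documented `Media` type, and both `assumedEpisodeLength` (24 minutes) and the helper names are assumptions made for illustration, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// media is a local stand-in mirroring a few fields of the documented
// Media struct; it is not the package's own type.
type media struct {
	Title           string
	EpisodesWatched uint16
	EpisodeCount    uint16
}

// assumedEpisodeLength is an assumption for this sketch only.
const assumedEpisodeLength = 24 * time.Minute

// remainingEpisodes returns how many episodes are still unwatched,
// guarding against underflow when the watched count exceeds the total.
func remainingEpisodes(m media) uint16 {
	if m.EpisodesWatched >= m.EpisodeCount {
		return 0
	}
	return m.EpisodeCount - m.EpisodesWatched
}

// timeLeft sums the estimated watch time of all unwatched episodes.
func timeLeft(list []media) time.Duration {
	var total time.Duration
	for _, m := range list {
		total += time.Duration(remainingEpisodes(m)) * assumedEpisodeLength
	}
	return total
}

func main() {
	list := []media{
		{Title: "Example A", EpisodesWatched: 10, EpisodeCount: 12},
		{Title: "Example B", EpisodesWatched: 0, EpisodeCount: 3},
	}
	// 5 remaining episodes at the assumed 24 minutes each.
	fmt.Println(timeLeft(list))
}
```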

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func QueryDirectly

func QueryDirectly(url string) (*http.Response, error)

Types

type Cache

type Cache struct {
	QueryMedia                 func(*Media) (*http.Response, error)
	QueryProfileTab            func(string, ProfileTabType) (*http.Response, error)
	AnimeQueryRatelimiter      *Limiter
	MangaQueryRatelimiter      *Limiter
	ProfileTabQueryRatelimiter *Limiter
}

func CreateDefaultCache

func CreateDefaultCache() *Cache

func (*Cache) RetrieveAnimeRawData

func (cache *Cache) RetrieveAnimeRawData(item *Media) (io.ReadCloser, CacheInvalidator, error)

RetrieveAnimeRawData retrieves the HTML page for a media entry, which could for example be an anime or a manga, allowing further processing to retrieve additional information. If any data has been found, both a reader and an invalidator are returned. The invalidator can be used if whatever instance receives the data deems it invalid and wants it removed from the cache.

func (*Cache) RetrieveMangaRawData

func (cache *Cache) RetrieveMangaRawData(item *Media) (io.ReadCloser, CacheInvalidator, error)

RetrieveMangaRawData retrieves the HTML page for a media entry, which could for example be an anime or a manga, allowing further processing to retrieve additional information. If any data has been found, both a reader and an invalidator are returned. The invalidator can be used if whatever instance receives the data deems it invalid and wants it removed from the cache.

func (*Cache) RetrieveProfileTabRawData

func (cache *Cache) RetrieveProfileTabRawData(profileId string, tabType ProfileTabType) (io.ReadCloser, CacheInvalidator, error)

type CacheInvalidator

type CacheInvalidator func() error

CacheInvalidator is a simple function type that makes sure the caller of RetrieveAnimeRawData knows what the second return value means. The invalidator is used for removing an item from the cache. This can be used by a parser if it deems an item to be invalid.
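
To illustrate the reader-plus-invalidator pattern, here is a minimal sketch that uses a hypothetical in-memory cache. The names `memoryCache`, `retrieve` and `parseOrInvalidate`, as well as the "contains `<html`" validity check, are assumptions for this example only. The package's real cache and parsers may work quite differently.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// invalidator is a local mirror of the documented CacheInvalidator type.
type invalidator func() error

// memoryCache is a hypothetical cache that keeps pages in a map.
type memoryCache struct {
	pages map[string]string
}

// retrieve returns a reader for the cached page together with an
// invalidator that removes exactly that page from the cache.
func (c *memoryCache) retrieve(key string) (io.ReadCloser, invalidator, error) {
	page, ok := c.pages[key]
	if !ok {
		return nil, nil, errors.New("not cached: " + key)
	}
	invalidate := func() error {
		delete(c.pages, key)
		return nil
	}
	return io.NopCloser(strings.NewReader(page)), invalidate, nil
}

// parseOrInvalidate plays the role of a parser: if the data doesn't look
// like HTML, it asks the cache to drop the entry and reports an error.
func parseOrInvalidate(c *memoryCache, key string) (string, error) {
	reader, invalidate, err := c.retrieve(key)
	if err != nil {
		return "", err
	}
	defer reader.Close()

	data, err := io.ReadAll(reader)
	if err != nil {
		return "", err
	}
	if !strings.Contains(string(data), "<html") {
		if err := invalidate(); err != nil {
			return "", err
		}
		return "", errors.New("invalid page removed from cache: " + key)
	}
	return string(data), nil
}

func main() {
	c := &memoryCache{pages: map[string]string{
		"good": "<html>ok</html>",
		"bad":  "garbage",
	}}
	_, err := parseOrInvalidate(c, "bad")
	_, stillCached := c.pages["bad"]
	fmt.Println(err, stillCached)
}
```

Returning the invalidator alongside the reader means the parser never needs to know how or where the item is cached. It only decides whether the data is usable.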

type Limiter

type Limiter struct {
	// contains filtered or unexported fields
}

func NewLimiter

func NewLimiter(tries int, per time.Duration) *Limiter

func (*Limiter) Wait

func (limiter *Limiter) Wait()
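
The documentation doesn't describe how `Limiter` behaves internally. As a rough mental model only, the sketch below shows one plausible reading of `NewLimiter(tries, per)`: at most `tries` calls to `Wait` may pass within any window of length `per`, and further calls block. `sketchLimiter`, `newSketchLimiter` and `measure` are hypothetical names, and the sliding-window strategy is an assumption, not the package's confirmed implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// sketchLimiter is a hypothetical stand-in for the documented Limiter.
type sketchLimiter struct {
	mu     sync.Mutex
	tries  int
	per    time.Duration
	stamps []time.Time // times of the most recent permitted calls
}

func newSketchLimiter(tries int, per time.Duration) *sketchLimiter {
	if tries < 1 {
		tries = 1 // always allow at least one call per window
	}
	return &sketchLimiter{tries: tries, per: per}
}

// Wait blocks until the caller may proceed without exceeding the limit.
// The lock is held while sleeping, so concurrent callers are serialized.
func (l *sketchLimiter) Wait() {
	l.mu.Lock()
	defer l.mu.Unlock()

	now := time.Now()
	// Forget calls that have already left the window.
	for len(l.stamps) > 0 && now.Sub(l.stamps[0]) >= l.per {
		l.stamps = l.stamps[1:]
	}
	// Window full: sleep until the oldest call leaves it.
	if len(l.stamps) >= l.tries {
		time.Sleep(l.per - now.Sub(l.stamps[0]))
		l.stamps = l.stamps[1:]
	}
	l.stamps = append(l.stamps, time.Now())
}

// measure reports how long the given number of Wait calls takes.
func measure(l *sketchLimiter, calls int) time.Duration {
	start := time.Now()
	for i := 0; i < calls; i++ {
		l.Wait()
	}
	return time.Since(start)
}

func main() {
	l := newSketchLimiter(2, 100*time.Millisecond)
	// The third call has to wait for the first one to leave the window.
	fmt.Println(measure(l, 3))
}
```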

type Media

type Media struct {
	EpisodesWatched uint16
	EpisodeCount    uint16
	Title           string
	Type            MediaType
	ProxerURL       string
	Status          Status

	EnglishTitle  string
	GermanTitle   string
	JapaneseTitle string
	Synonyms      []string
	Rating        float64
	ReleasePeriod ReleasePeriod
	Generes       []string
}

Media is the base for different types of media, such as Anime or Manga. Note that names such as `EpisodesWatched` are anime-specific, but work for Manga chapters as well. While using an interface would make the API cleaner, I simply don't care ;).

type MediaRawDataRetriever

type MediaRawDataRetriever func(*Media) (io.ReadCloser, CacheInvalidator, error)

type MediaType

type MediaType string

const (
	Series  MediaType = "Animeserie"
	Special MediaType = "Special"
	Movie   MediaType = "Movie"

	Manga     MediaType = "Mangaserie"
	Webtoon   MediaType = "Webtoon"
	Manhwa    MediaType = "Manhwa"
	Doujinshi MediaType = "Doujinshi"
)

type ProfileTabType

type ProfileTabType string

const (
	ProfileTabAnime ProfileTabType = "anime"
	ProfileTabManga ProfileTabType = "manga"
	ProfileTabNovel ProfileTabType = "novel"
)

type ReleasePeriod

type ReleasePeriod struct {
	FromSeason Season
	FromYear   uint

	ToSeason Season
	ToYear   uint
}

type Season

type Season string

Season represents the four seasons of the year. Proxer.me represents these as integers internally.

const (
	// Winter
	Q1 Season = "Q1"
	// Spring
	Q2 Season = "Q2"
	// Summer
	Q3 Season = "Q3"
	// Autumn
	Q4 Season = "Q4"
)

type Status

type Status string

const (
	// Finished means all episodes have been released. This doesn't imply
	// that all seasons have the same status.
	Finished Status = "Abgeschlossen"
	// PreAiring means yet to be released.
	PreAiring Status = "Nicht erschienen (Pre-Airing)"
	// Airing means the series has been released, but not all episodes have
	// been released yet.
	Airing Status = "Airing"
)

type Watchlist

type Watchlist struct {
	Watched           WatchlistCategory
	CurrentlyWatching WatchlistCategory
	ToWatch           WatchlistCategory
	StoppedWatching   WatchlistCategory
}

Watchlist holds the different types of watchlists for a profile.

func ParseProfileMediaTab

func ParseProfileMediaTab(reader io.Reader) (Watchlist, error)

ParseProfileMediaTab takes an HTML dump of any type of `Media` tab of a profile, such as `Anime`, and parses the contained watchlists. Note that the resulting Watchlist only contains certain data. You'll have to call WatchlistCategory.LoadExtraData on the respective lists if you require additional data.

type WatchlistCategory

type WatchlistCategory struct {
	Data []*Media
	// contains filtered or unexported fields
}

func (*WatchlistCategory) LoadExtraData

func (wc *WatchlistCategory) LoadExtraData(retrieveRawData MediaRawDataRetriever) error

LoadExtraData will retrieve additional information for all media entries in this category and load it into the respective *Media. Calling this a second time will not have an effect.
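
The "second call has no effect" behaviour can be pictured with a small sketch. Everything here is a local stand-in: `media`, `category`, `rawDataRetriever`, the `loaded` flag and `countingRetriever` are hypothetical, and treating the raw body as the English title merely stands in for real HTML parsing. How the package actually tracks the loaded state is hidden behind unexported fields.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// media and category are local stand-ins for Media and WatchlistCategory.
type media struct {
	Title        string
	EnglishTitle string
}

// rawDataRetriever mirrors the shape of MediaRawDataRetriever.
type rawDataRetriever func(*media) (io.ReadCloser, func() error, error)

type category struct {
	Data   []*media
	loaded bool // assumed flag; the real type keeps its state unexported
}

// loadExtraData fetches raw data for every entry once; later calls return
// immediately without invoking the retriever again.
func (c *category) loadExtraData(retrieve rawDataRetriever) error {
	if c.loaded {
		return nil
	}
	for _, m := range c.Data {
		reader, _, err := retrieve(m)
		if err != nil {
			return fmt.Errorf("retrieving %q: %w", m.Title, err)
		}
		raw, err := io.ReadAll(reader)
		reader.Close()
		if err != nil {
			return fmt.Errorf("reading %q: %w", m.Title, err)
		}
		// Stand-in for parsing: use the trimmed body as the English title.
		m.EnglishTitle = strings.TrimSpace(string(raw))
	}
	c.loaded = true
	return nil
}

// countingRetriever returns fake page bodies and counts its invocations.
func countingRetriever(calls *int) rawDataRetriever {
	return func(m *media) (io.ReadCloser, func() error, error) {
		*calls++
		body := io.NopCloser(strings.NewReader("English " + m.Title + "\n"))
		return body, func() error { return nil }, nil
	}
}

func main() {
	calls := 0
	c := &category{Data: []*media{{Title: "A"}, {Title: "B"}}}
	retrieve := countingRetriever(&calls)
	for i := 0; i < 2; i++ {
		if err := c.loadExtraData(retrieve); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	// The retriever ran once per entry, despite two loadExtraData calls.
	fmt.Println(calls, c.Data[0].EnglishTitle)
}
```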

Directories

Path Synopsis
cmd
	cli (command)
	watchnext (command)
	watchtimeleft (command)
