google_groups_crawler

v0.1.3
Published: Jun 20, 2021 License: Apache-2.0 Imports: 5 Imported by: 1

README

Google Groups Crawler

Installation

import crawler "github.com/casbin/google-groups-crawler"

Usage

First, create a GoogleGroup instance:

group := crawler.NewGoogleGroup(groupName, cookie)

The second parameter, cookie, is optional. Google Groups does not reveal the email addresses of repliers unless you are logged in, so to collect them you must pass the cookie of a logged-in user who is a member of the group.

If you leave cookie out, the code still works, but AuthorEmail in GoogleGroupMessage will be empty. If you need the cookie to access repliers' email addresses, follow these steps:

  • Open Google Chrome (or another browser) and log in to your Google account.
  • Navigate to Google Groups and select the group you want to crawl.
  • Press F12 and select the Network tab.
  • Select a conversation (any conversation in this group will do).
  • Select the first item in the request list.
  • Select Headers.
  • Under Request Headers, right-click cookie and copy its value.
  • Pass the copied value as the cookie parameter.
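
Because a session cookie is a credential, it may be preferable to keep it out of source code. The following is a minimal sketch, not part of the package, of one way to feed the optional variadic cookie parameter from the environment. The variable name GOOGLE_GROUPS_COOKIE, the group name "my-group", and the helpers cookieArgsFrom and cookieArgs are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cookieArgsFrom turns a raw cookie string into the argument list for a
// variadic `cookie ...string` parameter. A blank value yields nil, which
// expands to "no cookie argument at all".
func cookieArgsFrom(raw string) []string {
	cookie := strings.TrimSpace(raw)
	if cookie == "" {
		return nil
	}
	return []string{cookie}
}

// cookieArgs reads the cookie from the (hypothetical) environment variable
// GOOGLE_GROUPS_COOKIE.
func cookieArgs() []string {
	return cookieArgsFrom(os.Getenv("GOOGLE_GROUPS_COOKIE"))
}

func main() {
	args := cookieArgs()
	// With the real package this would be:
	//   group := crawler.NewGoogleGroup("my-group", args...)
	fmt.Printf("passing %d cookie argument(s)\n", len(args))
}
```

When the variable is unset, args is nil and the call behaves as if no cookie had been passed, so AuthorEmail stays empty.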
Get all conversations of the group
  • Google Groups is not reachable from some regions. If that affects you, set up an HTTP proxy and pass an http.Client configured to use it. If you can reach Google Groups directly, pass an empty http.Client{} as in the example below.
  • This function returns a slice of GoogleGroupConversation.
conversations := group.GetAllConversations(http.Client{})
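
The README does not show how to build a proxied client. The following is a minimal sketch using only the standard library; the proxy address http://127.0.0.1:8080, the 30-second timeout, and the helper names newProxyClient and proxyFor are assumptions for illustration, not part of the package.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyClient builds an http.Client that sends every request through the
// HTTP proxy at proxyAddr (for example "http://127.0.0.1:8080").
func newProxyClient(proxyAddr string) (http.Client, error) {
	proxyURL, err := url.Parse(proxyAddr)
	if err != nil {
		return http.Client{}, fmt.Errorf("parse proxy address: %w", err)
	}
	if proxyURL.Scheme == "" || proxyURL.Host == "" {
		return http.Client{}, fmt.Errorf("proxy address %q needs a scheme and host", proxyAddr)
	}
	return http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
		Timeout:   30 * time.Second, // arbitrary value; tune for your network
	}, nil
}

// proxyFor reports which proxy the client would use for target.
// It returns "" when the client has no proxy configured.
func proxyFor(client http.Client, target string) (string, error) {
	transport, ok := client.Transport.(*http.Transport)
	if !ok || transport.Proxy == nil {
		return "", nil
	}
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := transport.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	client, err := newProxyClient("http://127.0.0.1:8080")
	if err != nil {
		panic(err)
	}
	used, _ := proxyFor(client, "https://groups.google.com/")
	fmt.Println("requests go through:", used)
	// With the real package the client would then be passed along:
	//   conversations := group.GetAllConversations(client)
}
```

The client is returned by value because the package's methods accept http.Client rather than *http.Client.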
Get all messages of the conversation
  • conversation is an instance of GoogleGroupConversation.
  • The http.Client parameter works the same way as above.
  • This function returns a slice of GoogleGroupMessage.
messages := conversation.GetAllMessages(http.Client{}, removeGmailQuote)

Data Structure

type GoogleGroup struct {
    GroupName string
    Cookie    string
}

type GoogleGroupConversation struct {
    Title     string
    Id        string
    GroupName string
    Time      float64
    Cookie    string
}

type GoogleGroupMessage struct {
    Author      string
    AuthorEmail string
    Content     string
    Time        float64
    Files       []GoogleGroupFile
}
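
As an illustration of working with these fields, the sketch below sorts messages by Time and prints an author label that falls back to the name alone when AuthorEmail is empty. It is not part of the package. The local message type only mirrors the fields used here so that the example compiles on its own, the sample values are invented, and the sketch assumes that a larger Time means a later message.

```go
package main

import (
	"fmt"
	"sort"
)

// message mirrors the fields of crawler.GoogleGroupMessage used below so that
// this sketch is self-contained; real code would use the package type.
type message struct {
	Author      string
	AuthorEmail string
	Content     string
	Time        float64
}

// sortByTime orders messages by ascending Time, in place.
// Assumption: a larger Time value means a later message.
func sortByTime(msgs []message) {
	sort.SliceStable(msgs, func(i, j int) bool { return msgs[i].Time < msgs[j].Time })
}

// authorLabel returns "Name <email>" when the email is known and just the
// name otherwise (AuthorEmail is empty when no cookie was supplied).
func authorLabel(m message) string {
	if m.AuthorEmail == "" {
		return m.Author
	}
	return fmt.Sprintf("%s <%s>", m.Author, m.AuthorEmail)
}

func main() {
	// Invented sample data standing in for the result of GetAllMessages.
	msgs := []message{
		{Author: "Bob", Content: "Thanks!", Time: 200},
		{Author: "Alice", AuthorEmail: "alice@example.com", Content: "Hello", Time: 100},
	}
	sortByTime(msgs)
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", authorLabel(m), m.Content)
	}
}
```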

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type GoogleGroup

type GoogleGroup struct {
	GroupName string
	Cookie    string
}

func NewGoogleGroup

func NewGoogleGroup(name string, cookie ...string) GoogleGroup

func (GoogleGroup) GetAllConversations added in v0.1.0

func (g GoogleGroup) GetAllConversations(client http.Client) []GoogleGroupConversation

type GoogleGroupConversation

type GoogleGroupConversation struct {
	Title     string
	Id        string
	GroupName string
	Time      float64
	Cookie    string
}

func (GoogleGroupConversation) GetAllMessages

func (c GoogleGroupConversation) GetAllMessages(client http.Client, removeGmailQuote bool) []GoogleGroupMessage

type GoogleGroupFile added in v0.1.3

type GoogleGroupFile struct {
	FileName string
	Url      string
	Type     string
}

type GoogleGroupMessage

type GoogleGroupMessage struct {
	Author      string
	AuthorEmail string
	Content     string
	Time        float64
	Files       []GoogleGroupFile
}
