gospider

command module
v0.0.0-...-28f2d27 Latest
Published: Apr 4, 2021 License: Apache-2.0 Imports: 1 Imported by: 0

README

Go crawler framework

A clean, simple user interface

The basic command-line program is built with cobra.

chromedp

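Calls the Chrome API to visit web pages in the background.

Because Chrome is not installed on the VPS server, we can use a Docker image to provide it instead.

docker-compose.yml file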
version: '3'
services:
  db:
    image: postgres
    container_name: db
    restart: always
    ports: 
      - 5432:5432
    environment:
      - POSTGRES_USER=spider
      - POSTGRES_PASSWORD=20090909
      - POSTGRES_DB=spider
    volumes:
      - postgres-data:/var/lib/postgresql/data/ 
  adminer:
    image: adminer
    container_name: adminer
    links:
      - db
    restart: always
    environment:
      - POSTGRES_HOST=db
      - POSTGRES_PORT=5432
      - POSTGRES_USER=spider
      - POSTGRES_PASSWORD=20090909
      - POSTGRES_DB=spider
    ports:
      - 9433:8080
  spider:
    image: chromedp/headless-shell:latest
    container_name: spider
    depends_on:
      - db
    links:
      - db
    restart: always
    ports:
      - 9222:9222
    environment:
      - POSTGRES_HOST=db
      - POSTGRES_PORT=5432
      - POSTGRES_USER=spider
      - POSTGRES_PASSWORD=20090909
      - POSTGRES_DB=spider
    volumes:
      - /root/go/bin:/root/go/bin
      - ./crontab_job:/etc/cron.d/container_cronjob
    command:
      - chmod 644 /etc/cron.d/container_cronjob && cron

volumes:
  postgres-data: 

Run docker-compose up to start the services.

![docker chrome spider](./docs/assets/docker-chrome-spider.png)
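Once the spider container is up, port 9222 serves Chrome's DevTools HTTP endpoint. The sketch below uses only the Go standard library to check that the endpoint is reachable and to read the WebSocket debugger URL from `/json/version`. It assumes the port is published on localhost as in the compose file above; the function names are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// versionInfo holds the fields of interest from Chrome's /json/version reply.
type versionInfo struct {
	Browser              string `json:"Browser"`
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// parseVersion decodes the JSON body returned by /json/version.
func parseVersion(data []byte) (versionInfo, error) {
	var info versionInfo
	err := json.Unmarshal(data, &info)
	return info, err
}

// devtoolsVersion queries the DevTools HTTP endpoint at base,
// for example "http://localhost:9222".
func devtoolsVersion(base string) (versionInfo, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(base + "/json/version")
	if err != nil {
		return versionInfo{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return versionInfo{}, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return versionInfo{}, err
	}
	return parseVersion(body)
}

func main() {
	// Assumes the compose stack is running and 9222 is published on localhost.
	info, err := devtoolsVersion("http://localhost:9222")
	if err != nil {
		fmt.Fprintln(os.Stderr, "headless Chrome not reachable:", err)
		os.Exit(1)
	}
	fmt.Println("browser:", info.Browser)
	fmt.Println("websocket:", info.WebSocketDebuggerURL)
}
```

A chromedp program would typically hand such a WebSocket URL to `chromedp.NewRemoteAllocator` so that it drives the containerized browser instead of launching a local one.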

goquery

Queries page elements.

go prisma

prisma.schema file
generator db {
  provider = "go run github.com/prisma/prisma-client-go"
}

datasource db {
  provider = "postgresql"
  url      = "postgresql://spider:20090909@db:5432/spider?schema=public" 
}

model movies {
  id          Int      @id @default(autoincrement())
  title       String?
  subtitle    String?
  other       String?
  desc        String?
  year        String?
  area        String?
  tag         String?
  star        String?
  comment     String?
  quote       String?
  created_at  DateTime @default(now())
  updated_at  DateTime @default(now())
}
Use the Prisma Go client, which is preferable to frameworks such as gorm. Generate it with:
go run github.com/prisma/prisma-client-go generate
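Set the environment parameters
  1. When running on the VPS, edit the hosts file so that db points to 127.0.0.1.

  2. When running in Docker, edit docker-compose.yml so that db points to the db container.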

Initialize the database with deno

echo "postgresql://spider:20090909@db:5432/spider?schema=public"

deno run -A --unstable https://raw.githubusercontent.com/linuxing3/gospider/main/create_table.ts

Build the front-end interface with go admin

go install github.com/GoAdminGroup/go-admin/adm

Documentation

Overview

Copyright © 2021 NAME HERE <EMAIL ADDRESS>

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
