IDEA reports "cannot find symbol" when compiling a project

You can try the following four approaches.

1. Rebuild the project

Right-click the project and choose Rebuild.

2. Re-import the Maven dependencies

Right-click the project, then Maven > Reimport.

Wait for the dependencies to finish downloading.



3. Go to the project directory and run mvn clean install

This can also fix cases where the project fails to compile in the IDE.

4. When Lombok reports an error such as "lombok requires annotation processing"

Settings > Build, Execution, Deployment > Compiler > Annotation Processors
Check Enable annotation processing.

Then rebuild the project as described in step 1.


SLF4J: Class path contains multiple SLF4J bindings. Resolving a logback conflict in Spring Boot

Spring Boot runs into a binding conflict when integrating logback.

The following error is reported at startup:

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/E:/Repositories/m2/org/slf4j/slf4j-log4j12/1.7.26/slf4j-log4j12-1.7.26.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/E:/Repositories/m2/ch/qos/logback/logback-classic/1.2.3/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.lang.IllegalArgumentException: LoggerFactory is not a Logback LoggerContext but Logback is on the classpath. Either remove Logback or the competing implementation (class org.slf4j.impl.Log4jLoggerFactory loaded from file:/E:/Repositories/m2/org/slf4j/slf4j-log4j12/1.7.26/slf4j-log4j12-1.7.26.jar). If you are using WebLogic you will need to add 'org.slf4j' to prefer-application-packages in WEB-INF/weblogic.xml: org.slf4j.impl.Log4jLoggerFactory
at org.springframework.util.Assert.instanceCheckFailed(Assert.java:637)
at org.springframework.util.Assert.isInstanceOf(Assert.java:537)
at org.springframework.boot.logging.logback.LogbackLoggingSystem.getLoggerContext(LogbackLoggingSystem.java:286)
at org.springframework.boot.logging.logback.LogbackLoggingSystem.beforeInitialize(LogbackLoggingSystem.java:102)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationStartingEvent(LoggingApplicationListener.java:191)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEvent(LoggingApplicationListener.java:170)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.starting(EventPublishingRunListener.java:68)
at org.springframework.boot.SpringApplicationRunListeners.starting(SpringApplicationRunListeners.java:48)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:293)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1242)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1230)
at com.ipo3.ZuulGaewayApplication.main(ZuulGaewayApplication.java:17)

Find the following dependency and remove it. If it is pulled in transitively by another dependency, add an exclusion for slf4j-log4j12 to that dependency instead.

 <dependency>
     <groupId>org.slf4j</groupId>
     <artifactId>slf4j-log4j12</artifactId>
 </dependency>

Python crawler: scraping Amazon reviews

Main logic: use a browser to simulate a normal product search, crawl the reviews, save them to a SQLite database, and then display them on a web page.

Demo

http://joyon.wang/test.php

The main frameworks used, plus Chrome; installation commands and setup are as follows.

Python version 3.6.3

pip install selenium

pip install lxml
pip install beautifulsoup4
 
 
Place chromedriver in Python's Scripts directory; download it from:

https://npm.taobao.org/mirrors/chromedriver/70.0.3538.67/
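
Before running the crawler, it may be worth a quick smoke test that selenium can actually find and drive chromedriver. A minimal sketch, assuming selenium 3.x (which still accepts the executable_path argument); the driver path below is only an example and should point to wherever chromedriver was placed:

check_driver.py

from selenium import webdriver

# point at chromedriver explicitly (example path; adjust to your own Scripts directory)
driver = webdriver.Chrome(executable_path=r'C:\Python36\Scripts\chromedriver.exe')
driver.get('https://www.amazon.com/')
print(driver.title)  # should print the Amazon home page title
driver.quit()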

Main program code

amazon.py

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium import webdriver
from bs4 import BeautifulSoup
import lxml.html
import time
import re

import io  
import sys
# import the SQLite driver
import sqlite3
# connect to the SQLite database; the file is typec.db and
# it is created automatically in the current directory if it does not exist
conn = sqlite3.connect('typec.db')

# change the default encoding of standard output to gbk
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gbk')


HOST='https://www.amazon.com/'
START_PAGE=6
SERVICE_ARGS = ['--load-images=false', '--disk-cache=true']
KEYWORD = 'type-c usb cable'
#client = pymongo.MongoClient(MONGO_URL)
#db = client[MONGO_DB]

browser = webdriver.Chrome()
# browser = webdriver.Firefox()
wait = WebDriverWait(browser, 10)
browser.set_window_size(1400, 900)

# search for the keyword
def search():
    print('Searching...')
    try:
        browser.get('https://www.amazon.com/')
        input = wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '#twotabsearchtextbox'))
        )
        submit = wait.until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, '#nav-search > form > div.nav-right > div > input')))
        input.send_keys(KEYWORD)
        submit.click()
        total = wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '#pagn > span.pagnDisabled')))
        #get_products()
        print('Total: ' + total.text + ' pages')
        return total.text
    except TimeoutException:
        return search()

# go to the next page of products
def next_page(number):
    print('Moving to product page', number)
    try:
        wait.until(EC.text_to_be_present_in_element(
            (By.CSS_SELECTOR, '#pagnNextString'), 'Next Page'))
        submit = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#pagnNextString')))
        submit.click()
        wait.until(EC.text_to_be_present_in_element(
            (By.CSS_SELECTOR, '.pagnCur'), str(number)))
        time.sleep(1)
        # always page forward, but only start scraping once START_PAGE is reached
        if number < START_PAGE:
            return
        get_products()
    except TimeoutException:
        next_page(number)

# get the products on the current page and iterate over them
def get_products():
    try:
        print('Searching for product listings')
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#s-results-list-atf')))
        print('Product list located')
        html = browser.page_source
        soup = BeautifulSoup(html, 'lxml')
        doc = lxml.html.fromstring(html)
        print(doc)
        #date = doc.xpath(r'//*[@class="s-result-item s-result-card-for-container-noborder s-carded-grid celwidget  "]/div/div[3]/div[1]/span[2]/text()')
        #print(date)
        content = soup.find_all(attrs={"id": re.compile(r'result_\d+')})
        #print(content
        
      
        # iterate over the products
        for item in content:
            # skip banner cards such as "Top Rated from Our Brands"
            if item.find(class_='a-size-medium a-color-base') is not None:
                continue
            #print(item)
            p = item.find(class_='sx-price-whole')
            s = item.find(class_='sx-price-fractional')
            price = '0'
            if p != None  and s != None:
               price=p.get_text()+'.'+s.get_text()
            
            url = item.find('a',class_='a-link-normal a-text-normal').get('href')
            #print(url)
           
            #continue
            #url = item.find('a').get('href')
            if HOST in url:
               print(url)
            else:
              url = HOST+url
            product = {
                'title': item.find('h2').get_text().replace('[Sponsored]',''),
                'price': price,
                'image': item.find(class_='s-access-image cfMarker').get('src'),
                'url': url
            }
            # save the product to the database; insert_product returns it with its new id
            product = insert_product(product)
            print(product)

            #save_to_mongo(product)
            # open the product page and crawl its reviews
            open_product(product)
            # close the current product tab and switch back to the first window
            browser.close()
            browser.switch_to.window(browser.window_handles[0])

            
            
    except Exception as e:
        print(e)
# go to the next page of reviews
def review_next(number):
    print('Turning review page', number)
    try:
        # wait for the page to load
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#cm_cr-review_list')))

        #wait.until(EC.text_to_be_present_in_element(
        #    (By.CSS_SELECTOR, '.a-last'), 'Next Page'))
        # the "Next →" pagination link, found by the arrow character
        continue_link = browser.find_element_by_partial_link_text('→')
        #submit = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.a-last')))
        continue_link.click()
        #wait.until(EC.text_to_be_present_in_element(
        #    (By.CSS_SELECTOR, 'a-selected page-button'), str(number)))
        
		
        print('Review page turned:', number)
        return True
        #get_products()
    except Exception as e:
        print('Failed to turn review page', number, e)
        return False

# open a single product
def open_product(product):
    print('Opening product:', product['title'])
    try:
        #print(product)
        #purl = item.find('a').get('href')
        #print(product['title'])
        #purl = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 's-access-image cfMarker')))
        #purl.click()
        # open the product in a new window
        browser.execute_script('window.open()')
        # switch to the new window
        browser.switch_to.window(browser.window_handles[1])
        browser.get(product['url'])
        
        # wait for the reviews section to load; if there are no reviews, skip them
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#reviewsMedley')))
        html = browser.page_source
        soup = BeautifulSoup(html, 'lxml')
        review_count = soup.find(attrs={'data-hook':'total-review-count'}).get_text()
        if 'No' in review_count:
            print('No reviews')
            return

        # wait for the "see all reviews" link to become clickable
        commentAll = wait.until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, '#dp-summary-see-all-reviews')))
       
        # open the reviews page
        commentAll.click()
        # crawl the first page of reviews and save them
        list_review(product, 1)
        # then page through the remaining review pages
        page = 2
        while True:
            result = review_next(page)
            if result:
                page = page + 1
                # wait a second: the page is refreshed via ajax and needs time
                time.sleep(1)
                list_review(product, page)
            else:
                break
    except TimeoutException as e:
        print(e)

# get the list of reviews on the current page and save them to the database
def list_review(product, num):
        print('Scanning review list, page ' + str(num))
        # wait for the review list to finish loading
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#cm_cr-review_list')))
        print('Review list located, page ' + str(num))
        html = browser.page_source
        soup = BeautifulSoup(html, 'lxml')
        review_count = soup.find(attrs={'data-hook':'total-review-count'}).get_text()
        #print('Total review count: ' + review_count)
        doc = lxml.html.fromstring(html)
        #print(doc)
        #print(date)
        content = soup.find_all(class_='a-section celwidget')
        #print(content
        # iterate over the reviews on the current page
        for item in content:
            #print(item)
            comment = {
                'user':item.find(class_='a-profile-name').get_text(),
                'title':item.find(attrs={'data-hook':'review-title'}).get_text(),
                'star':item.find(attrs={'data-hook':'review-star-rating'}).get_text()[0:1],
                'date':item.find(attrs={'data-hook':'review-date'}).get_text(),
                'text':item.find(attrs={'data-hook':'review-body'}).get_text(),
                'productId':product['id']
                
            }
            # insert the review into the database
            insert_review(comment)
            #print(comment)
# insert a product into the database
def insert_product(product):
    cursor=conn.cursor()
    cursor.execute("insert into product(title,price,img,url) values(?,?,?,?)", (product['title'],product['price'],product['image'],product['url']))
    #cursor.connection.commit()
    #print(cursor.rowcount)
    # fetch the id of the product just inserted, used to link its reviews
    cursor.execute("select max(id) from product")
    #print(cursor.rowcount)
    id = cursor.fetchone()
    #print(id)
    product['id'] = id[0]
    cursor.close()
    conn.commit()
    return product
# insert a review into the database
def insert_review(review):
    cursor=conn.cursor()
    cursor.execute("insert into review(product_id,user,title,star,date,text) values(?,?,?,?,?,?)", (review['productId'],review['user'],review['title'],review['star'],review['date'],review['text']))
    cursor.close()
    conn.commit()

    
# main entry point
def main():
    try:
        total = int(search())
        #time.sleep(5)
        if START_PAGE==1 :
            get_products()
        for i in range(2, total + 1):
            next_page(i)
    except Exception as e:
        print('Something went wrong:', e)
    finally:
        conn.close()
        #browser.close()


if __name__ == '__main__':
    main()

SQLite table creation script

sqlite.py

# import the SQLite driver
import sqlite3
# connect to the SQLite database; the file is typec.db and
# it is created automatically in the current directory if it does not exist
conn = sqlite3.connect('typec.db')
# create a cursor
cursor = conn.cursor()
# create the product and review tables
cursor.execute('create table product(id integer primary key AUTOINCREMENT,title varchar(200),price REAL,img varchar(500),url varchar(500))')

cursor.execute('create table review(id integer primary key AUTOINCREMENT,product_id integer,user varchar(30),title varchar(200),star integer,date varchar(20),text varchar(2000))')
#cursor.execute("insert into product(title,price,img,url) values(?,?,?,?)", ('2','2','2','2'))

cursor.execute("select max(id) from product")

id = cursor.fetchone()
print(id[0])

 #cursor.execute('insert into user(id,name)values(\'1\',\'Michael\')')
#继续执行一条SQL语句,插入一条记录:
#cursor.execute('insert into user(id,name)values(\'1\',\'Michael\')')

#通过rowcount获得插入的行数:
cursor.rowcount

# close the cursor
cursor.close()
# commit the transaction
conn.commit()
# close the connection
conn.close()

After completing the steps above, simply run, in order (sqlite.py only needs to run once, to create the tables):

sqlite.py

amazon.py

The crawler will then automatically start scraping reviews from Amazon and saving them to SQLite.
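
The page that displays the results (the test.php demo linked above) is not included in this post. As a rough sketch of the read side, assuming only the product and review tables created by sqlite.py, the saved reviews can be pulled back out of typec.db like this (show_reviews.py is a hypothetical helper, not one of the original scripts):

show_reviews.py

import sqlite3

conn = sqlite3.connect('typec.db')
cursor = conn.cursor()
# join each review to its product and print a short summary
cursor.execute(
    "select p.title, r.star, r.title, r.user, r.date "
    "from review r join product p on p.id = r.product_id "
    "order by p.id, r.id")
for product_title, star, review_title, user, date in cursor.fetchall():
    print(product_title)
    print('  %s stars | %s | by %s (%s)' % (star, review_title, user, date))
cursor.close()
conn.close()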


Installing MySQL on Linux

Run the commands:
    yum install mysql
    yum install mysql-server
    yum install mysql-devel
If installing mysql-server fails with: No package mysql-server available.
Fix:
     wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
     rpm -ivh mysql-community-release-el7-5.noarch.rpm
Check:
      ls -1 /etc/yum.repos.d/mysql-community*
You should see:
    /etc/yum.repos.d/mysql-community.repo
    /etc/yum.repos.d/mysql-community-source.repo
Finally:
   yum install mysql-server

Installation succeeded!