Crawling: fetching the list

notty 2023. 10. 30. 13:34
import requests
from bs4 import BeautifulSoup
import pandas as pd

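# Step 1: Send an HTTP GET request to each page of the list
# NOTE: the original post does not show the URL. The value below is a
# placeholder, and the 'page' query parameter is an assumption.
base_url = 'https://example.com/list'
pg_num = 66  # number of list pages to crawl
data = []
for i in range(1, pg_num + 1):
    response = requests.get(base_url, params={'page': i})
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page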

    # Step 2: Parse the HTML content of the page with BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Step 3: Locate the table and iterate through rows
    table = soup.find('table', {'class': 'table-list table-case'})  # class attribute of the list table on the target page
    rows = table.find_all('tr')[1:]  # assuming the first row is the header

    # Step 4: Extract the desired data from each row
    for row in rows:
        cols = row.find_all('td')
        cols = [elem.text.strip() for elem in cols]
        data.append(cols)

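# Step 5: Create a DataFrame
# Column names are kept in Korean as in the original. In English: No., claimed disease,
# review result, review year, disposition, purpose of claim, application details,
# applicant's argument, medical records and medical opinion, facts found,
# relevant statutes, committee's judgment and conclusion.
# The number of names should match the number of cells in each scraped row.
df = pd.DataFrame(data, columns=['연번', '신청질병 내용', '심의결과', '심의연도', '주문', '청구취지', '신청내용', '신청인주장', '진료기록 및 의학적 소견', '인정사실', '관계법령', '위원회 판단 및 결론'])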

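# Click the button (Selenium; `driver` and `By` are not defined in this script, so the line stays commented out)
# driver.find_elements(By.CLASS_NAME,'btn-badge')[0].click()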

# Step 6: Save the DataFrame to a file
# df.to_csv('output.csv', index=False)
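
The loop above walks through page numbers and then hands the collected rows to pandas with a fixed list of column names. As a rough sketch of those two steps, the helpers below build one URL per page and pad or truncate each row to a fixed width. Everything named here is an assumption for illustration: `BASE_URL`, `PAGE_PARAM`, `page_url`, and `normalize_row` do not appear in the original post, and the real site may use a different paging scheme.

```python
from urllib.parse import urlencode

# Hypothetical values: the original post does not show the real URL or
# the name of the paging parameter.
BASE_URL = "https://example.com/cases"
PAGE_PARAM = "page"


def page_url(page, base_url=BASE_URL, page_param=PAGE_PARAM):
    """Return the list URL for one page number."""
    return f"{base_url}?{urlencode({page_param: page})}"


def normalize_row(cells, n_cols, fill=""):
    """Pad or truncate a row so it has exactly n_cols cells."""
    cells = list(cells)[:n_cols]
    return cells + [fill] * (n_cols - len(cells))


# One URL per page, for the first three pages.
urls = [page_url(i) for i in range(1, 4)]

# A short row is padded with empty strings, a long row is truncated.
rows = [normalize_row(r, 3) for r in (["1", "a"], ["2", "b", "c", "d"])]
```

Normalizing rows before building the DataFrame keeps a single malformed table row from breaking the whole run, at the cost of silently dropping extra cells, so it may be worth logging any row whose length had to change.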