Author: Fanastr (This article is contributed by the author, see the end for a brief introduction)
I wrote this article mainly because I read another one.
“After using Python to scrape these beautiful goddesses on Douyin, I suddenly became a winner in life” briefly describes how an engineer used Python + ADB + Tencent’s AI to follow over a thousand beautiful girls in one night.
This fully reflects the difference between a college graduate and a factory worker, and I must say ××× is impressive…
Once, I too was alone in that huge factory, contemplating life and wondering where I should go.
I remember getting through those torturous days by scrolling through Douyin.
I just wasn’t as skilled as the expert above; I picked out the beautiful girls by hand…
Even now, I have registered a Tencent AI account but still don’t know how to use it.
So let’s start with something simple: follow the accounts in advance, then use Python to automate downloading their street-photography videos!
/ 01 / Charles
Use Charles to find the video API endpoint. The steps are the same as in the earlier Dangdang example, so I won’t repeat them.
Swipe through videos in the Douyin app to capture the video request information.
After several experiments, I found that the end of each link keeps changing while the beginning stays constant: the links start with “http://v1-dy”, “http://v6-dy”, or “http://v9-dy”.
So the script can use these prefixes to recognize video links.
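The prefix check described above can be sketched as a small helper. This is only an illustration: the name `is_video_url` and the sample URLs are made up for the example, and the prefix list is an assumption based on what was observed in Charles, so it may need extending.

```python
# Assumption: these are the prefixes observed in Charles; the list may need extending.
VIDEO_PREFIXES = ('http://v1-dy', 'http://v6-dy', 'http://v9-dy')


def is_video_url(url):
    """Return True if the URL starts with one of the observed video prefixes."""
    # str.startswith accepts a tuple and tries each prefix in turn.
    return url.startswith(VIDEO_PREFIXES)


# Made-up URLs, for illustration only.
print(is_video_url('http://v6-dy.example/abc'))
print(is_video_url('http://api.example/feed'))
```

Matching with `startswith` is slightly stricter than a substring test, because it ignores URLs that merely contain a prefix somewhere in the middle.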
/ 02 / mitmproxy
Use mitmdump, a component of mitmproxy, to hook a Python script into the proxy so that Python processes each intercepted response.
Here the script only collects the links; it does not download the videos itself.
Because I run the script from the folder containing mitmdump.exe, the script cannot import the requests module.
I didn’t want to deal with those annoying environment variables, so the script only records the links.
The videos are downloaded afterwards. The links need to be deduplicated, since the same link may be captured more than once.
The Python script is as follows.
def response(flow):
    urls = ['http://v1-dy', 'http://v3-dy', 'http://v6-dy', 'http://v9-dy']
    # Filter the URLs, only select video URLs
    for url in urls:
        if url in flow.request.url:
            print('Douyin video')
            # Append the link to the CSV file, one link per line
            with open('douyin.csv', 'a+', encoding='utf-8-sig') as f:
                f.write(flow.request.url + '\n')
            # Stop after the first match so each response is written only once
            break
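Since the script appends every matching request, the same link can end up in douyin.csv more than once. A minimal sketch of order-preserving deduplication follows; the helper name `dedupe_keep_order` and the sample links are hypothetical.

```python
def dedupe_keep_order(urls):
    """Drop repeated URLs while keeping the first occurrence of each."""
    # dict keys keep insertion order (Python 3.7+), so the first-seen order survives.
    return list(dict.fromkeys(urls))


# Made-up links, for illustration only.
links = ['http://v1-dy/a', 'http://v6-dy/b', 'http://v1-dy/a']
print(dedupe_keep_order(links))  # ['http://v1-dy/a', 'http://v6-dy/b']
```

Unlike checking membership in a growing list, this stays fast when the CSV holds many links.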
/ 03 / Appium
Configure the Appium parameters for Douyin.
Click the blue button, and the phone will automatically launch the Douyin App!
Next, operate the phone, then click the refresh button in Appium to get the element locators.
In practice, I found that Appium sometimes cannot locate an element accurately, perhaps for reasons similar to iframe pages on the web.
So for elements that cannot be found, I tap the screen coordinates on the phone directly.
Since screen sizes differ from phone to phone, these coordinates have to be changed for each device, so the approach is not universal.
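One way to soften the screen-size drawback is to scale the recorded coordinates to the actual device. The sketch below is not part of the original script: `scale_point` is a hypothetical helper, and the 1080x1920 reference size is an assumption about the phone the taps were recorded on.

```python
# Assumption: the hard-coded taps were recorded on a 1080x1920 screen.
REFERENCE_SIZE = (1080, 1920)


def scale_point(x, y, actual_size, reference_size=REFERENCE_SIZE):
    """Map a point recorded on the reference screen onto a screen of another size."""
    ref_w, ref_h = reference_size
    act_w, act_h = actual_size
    # Scale each axis independently and round to whole pixels.
    return round(x * act_w / ref_w), round(y * act_h / ref_h)


# Example: a tap recorded at (540, 960) mapped onto a 720x1280 screen.
print(scale_point(540, 960, (720, 1280)))  # (360, 640)
```

With Appium, the actual size could be read from `driver.get_window_size()`, which returns a dict with `width` and `height`. Proportional scaling is only an approximation, since the app layout may not stretch evenly on screens with a different aspect ratio.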
The general flow is shown in the images above. The image of the uploader’s homepage is missing, so please imagine it yourself. The Python code is as follows.
import time
import random
from appium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from appium.webdriver.common.touch_action import TouchAction
from selenium.webdriver.support import expected_conditions as EC


def main():
    # Set driver configuration
    server = 'http://localhost:4723/wd/hub'
    desired_caps = {
        'platformName': 'Android',
        'deviceName': 'STF_AL00',
        'appPackage': 'com.ss.android.ugc.aweme',
        'appActivity': '.main.MainActivity',
        # Disable the phone's soft keyboard
        'unicodeKeyboard': True,
        'resetKeyboard': True
    }
    driver = webdriver.Remote(server, desired_caps)
    wait = WebDriverWait(driver, 60)
    # Agree to the user privacy agreement, click
    button_1 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/q6')))
    button_1.click()
    # Deny phone permission, click
    button_2 = wait.until(EC.presence_of_element_located((By.ID, 'com.android.packageinstaller:id/permission_deny_button')))
    button_2.click()
    # Deny location permission, click
    button_3 = wait.until(EC.presence_of_element_located((By.ID, 'com.android.packageinstaller:id/permission_deny_button')))
    button_3.click()
    time.sleep(2)
    # Swipe up to enter the Douyin video playback page
    TouchAction(driver).press(x=515, y=1200).move_to(x=515, y=1000).release().perform()
    # A longer delay is needed here because Douyin has guided operations and prompts, so wait a moment
    time.sleep(20)
    # Click on Douyin's "Like" to enter the login interface
    TouchAction(driver).press(x=950, y=800).release().perform()
    # Click password login
    button_4 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/afg')))
    button_4.click()
    # Enter account
    button_5 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/ab_')))
    button_5.send_keys('your account')
    # Enter password
    button_6 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/aes')))
    button_6.send_keys('your password')
    time.sleep(2)
    # Because the soft keyboard will pop up and block the login button, need to click to dismiss the soft keyboard
    TouchAction(driver).press(x=980, y=1850).release().perform()
    time.sleep(2)
    # Click the login button
    button_7 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/abb')))
    button_7.click()
    time.sleep(2)
    # Successfully logged in, enter the Douyin video interface, click the bottom title bar "Me"
    TouchAction(driver).press(x=990, y=1850).release().perform()
    # Enter personal homepage, click on the follow section
    button_8 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/a_7')))
    button_8.click()
    # Enter the follow section, click the second follow
    button_9 = wait.until(EC.presence_of_element_located((By.XPATH, ' /hierarchy/android.widget.FrameLayout/android.widget.LinearLayout/android.widget.FrameLayout/android.widget.RelativeLayout/android.widget.LinearLayout/android.widget.FrameLayout/android.view.ViewGroup/android.widget.LinearLayout/android.support.v7.widget.RecyclerView/android.widget.RelativeLayout[2]/android.widget.RelativeLayout[1]')))
    button_9.click()
    # Enter the UP owner's homepage, click the first video
    button_10 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/aqm')))
    button_10.click()
    # Continuously scroll down the page until the bottom
    while True:
        TouchAction(driver).press(x=515, y=1247).move_to(x=515, y=1026).release().perform()
        time.sleep(float(random.randint(5, 10)))


if __name__ == '__main__':
    main()
Here is the video download code. The video links must be deduplicated first.
import os
import pandas as pd
import requests

folder_path = "F:/video/"
# exist_ok=True avoids an error when the folder already exists
os.makedirs(folder_path, exist_ok=True)
df = pd.read_csv('douyin.csv', header=None, names=["url"])
# Deduplicate links and remove the video links obtained just after entering Douyin
dom = []
for i in df['url'][2:]:
    if i not in dom:
        dom.append(i)
# Download videos, numbering the files from 1
for num, url in enumerate(dom, start=1):
    response = requests.get(url)
    filename = str(num) + '.mp4'
    # Build the output path from folder_path; 'wb' overwrites any earlier file of the same name
    with open(os.path.join(folder_path, filename), 'wb') as f:
        f.write(response.content)
    print(filename + ' download completed')
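The download loop above reads each whole video into memory through `response.content`. For larger files, writing the data chunk by chunk may be gentler on memory. The sketch below shows only the writing side; `save_chunks` is a hypothetical helper, and the demo uses in-memory bytes instead of a real download.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the number of bytes written."""
    written = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


# Demonstration with in-memory chunks written to a temporary folder.
demo_path = os.path.join(tempfile.gettempdir(), 'demo_douyin.mp4')
print(save_chunks([b'abc', b'', b'defg'], demo_path))  # 7
```

With requests, the chunks could come from `response.iter_content(chunk_size=8192)` on a response fetched with `stream=True`.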
In the end, I successfully obtained all the videos of the beautiful girls…
If only I had had this skill back in the factory, haha.
Thinking about it further, most girls like filming Douyin videos, but they probably don’t know how to download them this way.
So here’s the opportunity for you young lads: download the Douyin videos of the girl you like.
Then edit them into a series called “Most Beautiful Moments”. Isn’t that an opportunity…
/ 04 / Conclusion
The code is all available on GitHub. https://github.com/Tobby-star/douyin
Author of this article
Fanastr: Python enthusiast, focused on web scraping, data analysis, and visualization