Automated Downloading of Douyin Videos Using Python

Author: Fanastr (This article is contributed by the author, see the end for a brief introduction)

I wrote this article mainly because I had read another one, “After using Python to scrape these beautiful goddesses on Douyin, I suddenly became a winner in life.”

That article briefly describes how an engineer used Python, ADB, and Tencent’s AI to follow over a thousand beautiful girls in one night.

This fully reflects the difference between college students and workers in the factory, and I must say ××× is impressive…

Once, I was also alone in that huge factory contemplating life, thinking about where I should go.

I remember that I also spent those torturous days by scrolling through Douyin.

It’s just that I wasn’t as skilled as the big guy above; I manually identified beautiful girls…

Even now, although I have registered a Tencent AI account, I still don’t know how to use it.

So let’s start with something simple: follow the accounts by hand in advance, then use Python to automate downloading their street-photography videos!

/ 01 / Charles

Use Charles to find the video API endpoint. The procedure is the same as in the earlier Dangdang case, so I won’t go into detail.

Swipe through videos in the Douyin app and the video request information shows up in Charles.

After several experiments, I found that the tail of each link keeps changing while the beginning stays constant: the links start with “http://v1-dy”, “http://v6-dy”, or “http://v9-dy” (the script in the next section also checks “http://v3-dy”).

So when writing the script, you can use these prefixes to pick out the video links.
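
The prefix check can be sketched as a small helper. This is only an illustration, not the author's script: the name `is_video_url` and the sample URLs are made up for the example, and it uses `str.startswith` rather than a substring test.

```python
# Prefixes observed at the start of Douyin video links (taken from the article).
VIDEO_PREFIXES = ('http://v1-dy', 'http://v6-dy', 'http://v9-dy')


def is_video_url(url: str) -> bool:
    """Return True if the URL starts with one of the known video prefixes."""
    # str.startswith accepts a tuple and checks each prefix in turn.
    return url.startswith(VIDEO_PREFIXES)


if __name__ == '__main__':
    # Hypothetical URLs, used only to exercise the helper.
    print(is_video_url('http://v6-dy.example.com/abc123/video'))
    print(is_video_url('http://api.example.com/feed'))
```

Matching on the start of the URL is slightly stricter than checking whether the prefix appears anywhere in it, which may or may not matter for real traffic.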

/ 02 / mitmproxy

Use mitmdump, a component of mitmproxy, to hook in a Python script that processes the traffic it intercepts.

Here the script only records the links; it does not download the videos directly.

Because I run the script from the folder that contains mitmdump.exe, the script cannot import the requests module.

I didn’t want to wrestle with environment variables, so the script only collects the links.

The videos are downloaded afterwards. The collected links may contain duplicates, so they need to be deduplicated first.

The Python script is as follows.

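def response(flow):
    # URL prefixes that identify Douyin video requests
    urls = ['http://v1-dy', 'http://v3-dy', 'http://v6-dy', 'http://v9-dy']
    # Filter the URLs, only select video URLs
    for url in urls:
        if url in flow.request.url:
            print('Douyin video')
            # Append the matching link to the CSV file, one link per line
            with open('douyin.csv', 'a+', encoding='utf-8-sig') as f:
                f.write(flow.request.url + '\n')
            # Stop after the first match so each request is written only once
            break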
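
Since the captured links may repeat, one option is to skip duplicates at write time rather than afterwards. The sketch below is a variant under stated assumptions, not the author's code: `record_url` and the module-level `seen` set are hypothetical names, and duplicates are only detected within a single run of the script.

```python
import os
import tempfile

# URLs already written during this run (hypothetical helper state).
seen = set()


def record_url(url, path):
    """Append url to the file at path unless it was already recorded.

    Returns True if the URL was written, False if it was a duplicate.
    """
    if url in seen:
        return False
    seen.add(url)
    with open(path, 'a', encoding='utf-8') as f:
        f.write(url + '\n')
    return True


if __name__ == '__main__':
    # Demo with made-up URLs and a temporary file.
    demo_path = os.path.join(tempfile.mkdtemp(), 'douyin.csv')
    for u in ['http://v1-dy/a', 'http://v1-dy/a', 'http://v9-dy/b']:
        record_url(u, demo_path)
    with open(demo_path, encoding='utf-8') as f:
        print(f.read().splitlines())
```

Inside a mitmdump script, `response(flow)` could call `record_url(flow.request.url, 'douyin.csv')` after the prefix check.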

/ 03 / Appium

Configure the Appium parameters for Douyin.

Click the blue button, and the phone will automatically launch the Douyin App!

Next, operate the phone, and then click the refresh button in Appium to obtain the element location code.

In practice, I found that Appium sometimes cannot locate an element accurately, perhaps in the same way that elements inside an iframe are hard to reach on the web.

So for elements that cannot be found, I tap the corresponding screen coordinates directly.

Because screen sizes differ between phones, these coordinates have to be adjusted for each device, so this approach is not portable.
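
One way to soften the screen-size problem is to store tap positions relative to a reference resolution and scale them to the device at run time. The sketch below assumes the tap coordinates were recorded on a 1080×1920 screen (a guess, not stated in the article) and uses a hypothetical helper, `scale_point`. With Appium, the actual size could be read from `driver.get_window_size()`, which returns a dict with `width` and `height`.

```python
# Resolution on which the tap coordinates were recorded (assumed, not from the article).
REFERENCE_SIZE = (1080, 1920)


def scale_point(x, y, actual_size, reference_size=REFERENCE_SIZE):
    """Map a point from the reference screen to a screen of actual_size."""
    ref_w, ref_h = reference_size
    act_w, act_h = actual_size
    # Scale each axis independently and round to whole pixels.
    return round(x * act_w / ref_w), round(y * act_h / ref_h)


if __name__ == '__main__':
    # A tap recorded at (990, 1850), mapped to a 720x1280 screen.
    print(scale_point(990, 1850, (720, 1280)))
```

Proportional scaling only helps when the app lays out its screens the same way on both devices; it does not fix taps on elements whose position depends on content.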

The general flow is shown in the screenshots above. The screenshot of the uploader’s homepage is missing, so please imagine it. The Python code is as follows.

import time
import random
from appium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from appium.webdriver.common.touch_action import TouchAction
from selenium.webdriver.support import expected_conditions as EC

def main():
    # Set driver configuration
    server = 'http://localhost:4723/wd/hub'
    desired_caps = {
        'platformName': 'Android',
        'deviceName': 'STF_AL00',
        'appPackage': 'com.ss.android.ugc.aweme',
        'appActivity': '.main.MainActivity',
        # Disable the phone's soft keyboard
        'unicodeKeyboard': True,
        'resetKeyboard': True
    }
    driver = webdriver.Remote(server, desired_caps)
    wait = WebDriverWait(driver, 60)
    # Agree to the user privacy agreement, click
    button_1 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/q6')))
    button_1.click()
    # Deny phone permission, click
    button_2 = wait.until(EC.presence_of_element_located((By.ID, 'com.android.packageinstaller:id/permission_deny_button')))
    button_2.click()
    # Deny location permission, click
    button_3 = wait.until(EC.presence_of_element_located((By.ID, 'com.android.packageinstaller:id/permission_deny_button')))
    button_3.click()
    time.sleep(2)
    # Swipe up to enter the Douyin video playback page
    TouchAction(driver).press(x=515, y=1200).move_to(x=515, y=1000).release().perform()
    # A longer delay is needed here because Douyin has guided operations and prompts, so wait a moment
    time.sleep(20)
    # Click on Douyin's "Like" to enter the login interface
    TouchAction(driver).press(x=950, y=800).release().perform()
    # Click password login
    button_4 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/afg')))
    button_4.click()
    # Enter account
    button_5 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/ab_')))
    button_5.send_keys('your account')
    # Enter password
    button_6 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/aes')))
    button_6.send_keys('your password')
    time.sleep(2)
    # Because the soft keyboard will pop up and block the login button, need to click to dismiss the soft keyboard
    TouchAction(driver).press(x=980, y=1850).release().perform()
    time.sleep(2)
    # Click the login button
    button_7 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/abb')))
    button_7.click()
    time.sleep(2)
    # Successfully logged in, enter the Douyin video interface, click the bottom title bar "Me"
    TouchAction(driver).press(x=990, y=1850).release().perform()
    # Enter personal homepage, click on the follow section
    button_8 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/a_7')))
    button_8.click()
    # Enter the follow section, click the second followed account
    button_9 = wait.until(EC.presence_of_element_located((By.XPATH, '/hierarchy/android.widget.FrameLayout/android.widget.LinearLayout/android.widget.FrameLayout/android.widget.RelativeLayout/android.widget.LinearLayout/android.widget.FrameLayout/android.view.ViewGroup/android.widget.LinearLayout/android.support.v7.widget.RecyclerView/android.widget.RelativeLayout[2]/android.widget.RelativeLayout[1]')))
    button_9.click()
    # Enter the uploader's homepage, click the first video
    button_10 = wait.until(EC.presence_of_element_located((By.ID, 'com.ss.android.ugc.aweme:id/aqm')))
    button_10.click()
    # Keep swiping to the next video; this loop never exits, so stop the script manually at the last video
    while True:
        TouchAction(driver).press(x=515, y=1247).move_to(x=515, y=1026).release().perform()
        # Pause a random 5 to 10 seconds between swipes
        time.sleep(float(random.randint(5, 10)))

if __name__ == '__main__':
    main()

The download code is as follows. The video links must be deduplicated first.

import os

import pandas as pd
import requests

folder_path = "F:/video/"
# exist_ok avoids an error when the folder already exists
os.makedirs(folder_path, exist_ok=True)
df = pd.read_csv('douyin.csv', header=None, names=["url"])

# Deduplicate links and skip the first two, which are captured just after entering Douyin
dom = []
for i in df['url'][2:]:
    if i not in dom:
        dom.append(i)

# Download videos, numbering the files from 1
for num, url in enumerate(dom, start=1):
    response = requests.get(url)
    filename = str(num) + '.mp4'
    # Build the path from folder_path; 'wb' overwrites any earlier file of the same name
    with open(os.path.join(folder_path, filename), 'wb') as f:
        f.write(response.content)
    print(filename + ' download completed')
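
For large videos, writing the response in chunks avoids holding the whole file in memory. The helper below is a sketch with a hypothetical name, `save_chunks`. It takes any iterable of byte chunks, so with requests it could be fed `response.iter_content(chunk_size=8192)` from a request made with `stream=True`. The demo uses in-memory bytes instead of a real download.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of bytes chunks to path and return the byte count."""
    total = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            # Skip empty chunks so they do not trigger needless writes.
            if chunk:
                f.write(chunk)
                total += len(chunk)
    return total


if __name__ == '__main__':
    # Demo: write made-up bytes to a temporary file.
    demo_path = os.path.join(tempfile.mkdtemp(), '1.mp4')
    written = save_chunks([b'abc', b'', b'defg'], demo_path)
    print(written, os.path.getsize(demo_path))
```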

In the end, I successfully obtained all the videos of the beautiful girls…

If I had this skill back in the factory, how great it would have been, haha.

Actually, thinking a bit further: most girls like shooting Douyin videos, but they probably don’t know how to download them like this.

So here’s the opportunity for you young lads: download the Douyin videos of the girl you like.

Then edit them into a series called “Most Beautiful Moments”. Isn’t that an opportunity…

/ 04 / Conclusion

All the code is available on GitHub: https://github.com/Tobby-star/douyin

Author of this article

Fanastr: Python enthusiast, focused on web scraping, data analysis, and visualization
