Crawling Zhihu’s ‘God Replies’ with Python: Can’t Stop Laughing

Zhihu is full of amusing “God replies”: answers that startle you at first and stay with you long after. This article shows how to crawl Zhihu’s God replies and explains how the crawler works.

What characteristics do Zhihu’s God replies have? Let’s observe the following images:

[Images: two example God replies]

Do you see the pattern? They are concise and insightful, and they have a lot of upvotes. So to crawl Zhihu’s God replies, we only need to collect answers that are short and heavily upvoted.

This takes two simple steps: first, crawl Zhihu answers; second, filter them. Isn’t it easy?
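The two criteria can be sketched as a small predicate. This is only an illustration: the function name `is_god_reply` is hypothetical, and the default thresholds are borrowed from the MongoDB query shown later in this article.

```python
def is_god_reply(voteup_count, content, min_votes=1000, max_len=50):
    """Return True for a short answer with many upvotes.

    min_votes and max_len are illustrative defaults; tune them to taste.
    """
    return voteup_count >= min_votes and len(content) <= max_len


print(is_god_reply(2300, "//TODO"))  # True: many upvotes, very short
print(is_god_reply(12, "//TODO"))    # False: too few upvotes
```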

Crawling Zhihu Answers

First, we crawl the answers. Zhihu has far too many answers to crawl all at once, so we select a few topics and crawl only the content under those topics.

The following function is used to crawl the content of a specified topic:

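import json

import pymongo
import requests


def get_answers_by_page(topic_id, page_no):
    """Fetch one page of a topic, store it in MongoDB, and return is_end."""
    offset = page_no * 10  # start index of this page (10 items per page)
    url = <topic_url> # topic_url is the URL corresponding to this topic
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36",
    }
    # verify=False skips TLS certificate verification
    r = requests.get(url, verify=False, headers=headers)
    content = r.content.decode("utf-8")
    data = json.loads(content)
    is_end = data["paging"]["is_end"]  # True when this is the last page
    items = data["data"]
    client = pymongo.MongoClient()
    db = client["zhihu"]
    if len(items) > 0:
        db.answers.insert_many(items)
        # Record the page as saved (insert_one replaces the removed Collection.insert)
        db.saved_topics.insert_one({"topic_id": topic_id, "page_no": page_no})
    return is_end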

The get_answers_by_page function takes two parameters: the topic’s ID and the number of the page to crawl. It returns is_end, which tells the caller whether this was the last page.
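To crawl a whole topic, the function has to be called page by page until it reports the end. A minimal sketch of such a loop follows; the name `crawl_topic`, the `max_pages` cap, and the `delay` between requests are assumptions for illustration, not part of the original code. The page fetcher is passed in as an argument, so the loop can be tried out with a stand-in function.

```python
import time


def crawl_topic(topic_id, fetch_page, max_pages=100, delay=1.0):
    """Call fetch_page(topic_id, page_no) until it reports the last page.

    fetch_page is expected to behave like get_answers_by_page: it stores
    one page and returns True when there are no more pages. max_pages and
    delay are illustrative safety limits. Returns the number of pages fetched.
    """
    pages = 0
    for page_no in range(max_pages):
        is_end = fetch_page(topic_id, page_no)
        pages += 1
        if is_end:
            break
        time.sleep(delay)  # pause between requests to go easy on the site
    return pages
```

With the crawler above, the call would look like `crawl_topic(topic_id, get_answers_by_page)`.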

There are several fields in the crawled content that need attention, highlighted in yellow in the image below:

[Image: crawled content with the relevant fields highlighted]

The meanings of these fields are as follows:

  • question.title: The title of the question.

  • content: The content of the answer.

  • voteup_count: The number of upvotes.

These fields will be used in the next step to filter answers.
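As a rough illustration, these fields can be pulled out of one crawled item like this. The sketch assumes that each item keeps its answer under a `target` key, as the filtering query in this article implies, and that the question title sits under `target.question.title`; the helper name `summarize` and the sample item are made up.

```python
def summarize(item):
    """Pull the three fields of interest out of one crawled item.

    Assumes the answer is nested under item["target"]; returns None for
    items that are not answers.
    """
    target = item.get("target", {})
    if target.get("type") != "answer":
        return None
    return {
        "question": target.get("question", {}).get("title", ""),
        "answer": target.get("content", ""),
        "votes": target.get("voteup_count", 0),
    }


sample = {  # made-up item, for illustration only
    "target": {
        "type": "answer",
        "question": {"title": "What is recursion?"},
        "content": "See: recursion.",
        "voteup_count": 1500,
    }
}
print(summarize(sample))
```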

Filtering Answers

After crawling the data, let’s filter the results. We will use MongoDB’s aggregation pipeline to filter the answers.

For more information on using MongoDB’s aggregation pipeline, refer to the article Aggregation Pipeline Quick Reference:

https://docs.mongodb.com/manual/meta/aggregation-quick-reference/

The code is as follows:

import pymongo

client = pymongo.MongoClient()
db = client["zhihu"]
items = db.answers.aggregate([
    {"$match": {"target.type": "answer"}},                              # answers only
    {"$match": {"target.voteup_count": {"$gte": 1000}}},                # at least 1000 upvotes
    {"$addFields": {"answer_len": {"$strLenCP": "$target.content"}}},   # length in code points
    {"$match": {"answer_len": {"$lte": 50}}},                           # at most 50 characters
])

The above code keeps only the answers with at least 1000 upvotes and at most 50 characters, which gives us the concise and insightful God replies.
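If you want to experiment with other thresholds, the stages can be wrapped in a small builder. This is a sketch under assumptions: `build_pipeline` is a hypothetical helper, the `$sort` and `$project` stages are additions that are not in the original query, and the path `target.question.title` is assumed from the field list above.

```python
def build_pipeline(min_votes=1000, max_len=50):
    """Build aggregation stages for short, highly upvoted answers."""
    return [
        {"$match": {"target.type": "answer",
                    "target.voteup_count": {"$gte": min_votes}}},
        {"$addFields": {"answer_len": {"$strLenCP": "$target.content"}}},
        {"$match": {"answer_len": {"$lte": max_len}}},
        {"$sort": {"target.voteup_count": -1}},  # most upvoted first
        {"$project": {"_id": 0,
                      "question": "$target.question.title",  # assumed path
                      "answer": "$target.content",
                      "votes": "$target.voteup_count"}},
    ]
```

The result can be passed straight to `db.answers.aggregate(build_pipeline(min_votes=500))`, and each returned document then carries only the question, answer, and votes keys.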

This is the core code, and the complete code has been uploaded to GitHub:

https://github.com/pythonml/answer

Zhihu God Replies

Now that the code is complete, let’s run it and see the results. Since most followers are programmers, we will filter for God replies related to programmers. The results are as follows, with nearly a hundred humorous jokes 😂

PS: The jokes are merely a display of the crawler’s results for everyone’s amusement! Feel free to discuss which joke is the funniest in the comments!

1

Q:What are the most common “lies” programmers say?

A://TODO

2

Q:What are the catchphrases of computer science students?

A:My computer is running just fine…

3

Q:How to refute the view that “programmers are useless when away from the computer”?

A:No, many programmers are useless even in front of the computer.

4

Q:If one day everyone spoke in programming languages, what would that scene look like?

A:hello, world. 烫烫烫烫烫烫烫�d}��R�0:�v�?.

5

Q:I suddenly want to open a restaurant themed around programmers, called “Programmer’s Dish”, with dish names as keywords from various programming languages. Any advice on its prospects?

A:At the entrance, a big “hello world”, and the signature dish called “Braised Product Manager” will definitely be packed.

6

Q:What is recursion?

A:The definition and scope of “political content not suitable for public discussion” also belong to “political content not suitable for public discussion”.

7

Q:How should the basic programming term “Bug” be translated?

A:Little trouble, your program has encountered little trouble again.

8

Q:What is the joy of programming?

A:A person’s sense of achievement comes from two things: creation and destruction.

9

Q:What classic rumors exist in the computer world?

A:I have read and agree to the terms.

10

Q:As a programmer, what mathematical pitfalls have you encountered while coding?

A:When reading papers, one “obviously” took me an entire afternoon.

11

Q:What equipment do wealthy programmers have?

A:A girlfriend…

12

Q:Which deity should I pray to for code to not have bugs?

A:Pray to Yongzheng, specialized in treating the Eighth Prince.

13

Q:Is getting into a good university to study IT the only way for poor families in China to rise to the middle class?

A:Yes, there are four paths: writing code, doing finance, doing finance in the coding circle, and writing code in the finance circle.

14

Q:Why do programmers like to carry computer bags everywhere, even when they don’t have a computer inside?

A:Because they have no other bags.

15

Q:How to translate “Talk is cheap. Show me the code” effectively?

A:Stop talking nonsense, show me the code.

16

Q:Why do programmers’ girlfriends or wives generally have a much higher appearance level than the men?

A:I’m impressed by the looks of programmers’ girlfriends; if you ask ten programmers who their girlfriend is, nine will answer “Aragaki Yui”.

17

Q:Why do some people prefer to buy several mechanical keyboards to switch between them rather than using a face mask?

A:I don’t rely on looks to make a living. I rely on the money I earn through hard work. I can spend it however I want.

18

Q:What should be engraved on wedding rings for programmer couples?

A:0 error 0 warning.

19

Q:Do IT engineers feel uncomfortable when called “code farmers”?

A:At least we are still human; products and designs are already dogs…

20

Q:Why would a 30-year-old male salesperson invite a 24-year-old male programmer to Starbucks near the community?

A:Based on my years of experience, he probably has an amazing idea that just needs a programmer to implement it.

21

Q:How to find a girlfriend who likes programmers?

A:It depends on fate; there are so many users on Zhihu, and if you notice me, that’s fate.

22

Q:How do programmer girlfriends celebrate their boyfriends’ birthdays?

A:Tell him the interface is ready.

23

Q:As a programmer, how do you find a girlfriend after work?

A:It’s already rare for the questioner to be a programmer and still like girls.

24

Q:What preparations are needed for a programmer to switch to barbecue, and what are the advantages and disadvantages?

A:You see, you don’t even know the advantages and disadvantages of doing barbecue, so you still need a product manager.

25

Q:What can provoke programmers?

A:When passing by their computer, say, “Oh, writing bugs again!”

26

Q:My teacher said that Java is suitable for large software while C# is suitable for small and medium-sized software. Is this true?

A:Java has a talent for turning small and medium-sized software into large software.

27

Q:Why were programmers’ salaries so high in 2014?

A:The hourly wage is not high.

28

Q:Isn’t it true that most programmers complain about low salaries?

A:Who, who complains about high salaries?

29

Q:What should a single programmer do after solving a technical problem without a girl to show off to?

A:Now you understand why so many programmers write technical blogs.

30

Q:Do Chinese programmers prefer wearing “tactical jackets + jeans + sneakers”? If so, why has this trend formed?

A:Do you think they dress well for programmers to see?

31

Q:As an IT professional, what tools have greatly improved your work efficiency?

A:Being single.

32

Q:Why do I think programmers seem to be poor communicators?

A:Just consider us as having low emotional intelligence; that way, you are happy, and we are happy.

33

Q:In China, the oldest programmers are only around 40 years old. What can Chinese programmers do in the future?

A:This is the same principle as why no one from the 90s has lived past 30.

34

Q:How to respond to a programmer’s text: “Hello world”?

A:hello nerd.

35

Q:How can you tell if an IT guy likes a girl?

A:When he tries hard to get close to you despite his usual quiet demeanor.

36

Q:Will wired mice be replaced by wireless mice?

A:I think wired mice won’t be replaced in internet cafes.

37

Q:How to make someone realize that their C++ skills are not the best in China?

A:To be honest, I’m not pretending: my C++ skills are ranked 0 in the country.

38

Q:Why do all icons shake when deleting software on the iPhone?

A:Third-party software is scared, while system software is showing off.

39

Q:What classic rumors exist in the computer world?

A:Windows is looking for a solution online.

40

Q:With the current pace of iPhone processor performance doubling every year, will it soon catch up to or even surpass desktop processors?

A:When I was young, I always thought that in a couple of years, I would be as big as my two-year-older brother.

41

Q:What is the minimum benefit Zhihu has brought you?

A:It kills time without feeling guilty.

42

Q:What inhumane technological inventions or designs exist?

A:The computer can’t connect to the internet, and after diagnosing, it suggests I connect to the internet to solve it.

43

Q:Why are designers reluctant to be called graphic designers?

A:As long as the salary is high, you can call me anything.

44

Q:Why do some people think NetEase Cloud Music is the industry’s conscience?

A:One day, it suddenly pushed a message saying it found the lyrics I wanted.

45

Q:Why haven’t self-destructing drone attack weapons appeared? Have terrorists used them?

A:Are you talking about missiles?

46

Q:Since thoughts are mine, why can’t I control my negative emotions sometimes?

A:The operating system does not allow users to access, modify, or delete core system files, as this would damage the system and cause it to malfunction.

47

Q:Although Lu Xun is impressive, is he just a filler among the top ten literary masters in the world?

A:Why should literary masters pay for the rankings set by illiterates?

48

Q:What technologies are close to a bottleneck and have not seen significant breakthroughs for a long time?

A:Boiling water.

49

Q:How do you view some people’s preference for downloading software from official websites?

A:Have you never been through the Baidu family bucket?

50

Q:Why do many people buy laptops for gaming instead of using more powerful desktops?

A:Because they can’t afford a house…

51

Q:How shocking was the first time you heard good headphones?

A:The first time I heard good headphones wasn’t shocking, but when I switched back to regular headphones, the shock came.

52

Q:Is Chrome really power-hungry?

A:Not power-hungry; I’m using Chrome right now, and after all this time, my laptop still has 50% battery left.

53

Q:What is the experience of installing Windows on a MacBook?

A:It feels like suddenly having a soft spot while losing armor.

54

Q:What is it like to have all Apple products at home?

A:When one phone rings, the whole family rings.

55

Q:Why don’t you buy an iPhone X?

A:The contradiction between the growing needs for a better life and the reality of poverty.

56

Q:Why are some willing to spend thousands on an iPhone but not willing to spend tens on legitimate iPhone software and games?

A:Because you can’t download an iPhone.

57

Q:What apps have particularly stunning names?

A:Water Meter Assistant… it’s for checking express deliveries…

58

Q:Why do you need to buy an external hard drive?

A:When conditions improve, I also want to provide a comfortable living for my women.

59

Q:How to use an iPad to remotely shut down a PC?

A:Aim for the PC’s power button and throw it.

60

Q:How do you evaluate the Apple conference on September 7, 2016?

A:For the new MacBook Pro, I watched three conferences in six months…

61

Q:How do you evaluate Internet Explorer?

A:It’s a browser for downloading other browsers.

—–A year later—–

IE8 and below are terrible; it’s a crying rhythm for front-end development.

62

Q:My parents want me to save money to buy a house, but I want to buy an Apple computer. What should I do?

A:If you can really save 500,000 for a house in three years, does it matter if you spend 17,000 on a computer, big brother?

63

Q:What are some useless mobile apps?

A:SMS interception software! After intercepting, it tells you it intercepted a message. I believe 99% of people will click to see the intercepted message!

64

Q:What is the most headache-inducing part of making a complete PPT?

A:How to hide my abilities from my boss.

65

Q:What can Vim do that Emacs cannot?

A:Help the poor children of Uganda…

66

Q:Why do Apple users choose Apple?

A:Because users who don’t use Apple are not Apple users.

67

Q:Why shouldn’t programmers know how to fix computers?

A:Does Fan Bingbing need to know how to fix TVs?

68

Q:What is the item you have carried or worn the longest? What special significance does it have for you?

A:Glasses, because I’m blind.

69

Q:Is appearance really that important?

A:It’s important; generally, when a girl acts cute, I have to rely on force to solve the problem.

70

Q:Why are some people so sentimental at night?

A:When there are fewer images to render, the CPU has free time to ponder life.

71

Q:What was the reason for your first job change? Do you regret it?

A:I work for money; he insists on talking about ideals, but my ideal is not to work…

72

Q:How should you respond when asked in an interview why you didn’t go to Tsinghua University?

A:“I went, but the security guard didn’t let me in.”

73

Q:Will I be blacklisted for breaking a contract after accepting an offer from Alibaba?

A:Make a call and let the other party embrace change.

74

Q:How can boys dress in a way that gives off a two-dimensional feeling while still looking decent and not weird?

A:Just wear women’s clothing.

75

Q:What are some interesting short jokes in the IT field?

A:winrarsetup.rar.

76

Q:What should I do if I usually joke around, but when confessing to a girl, she thinks I’m joking?

A:She didn’t misunderstand; she’s just kind-hearted.

77

Q:How to gracefully reject a girl’s confession?

A:I’m aware; you should wait for news.

78

Q:Where can I meet high-quality boys more often?

A:You can stroll around during the Two Sessions; all the impressive uncles are there.

79

Q:Why do so many people say they are lonely, want to find a boyfriend/girlfriend, yet remain single?

A:Because not only are they ugly, but they also think others are ugly.

80

Q:What does it feel like to be a backup?

A:Every word is a password.

81

Q:What should I do if I can’t remove my makeup in time before rolling in bed?

A:Don’t, I want Avatar.

82

Q:What happens if a boy insists on saying goodnight to a girl every day for a year?

A:IE asks me every day if I want to set it as the default browser; it’s been years.

83

Q:What does it mean when a girl gives a boy a frog toy?

A:I give you a frog; you should at least return me some tadpoles.

84

Q:Is a girl’s appearance really that important?

A:It’s important; generally, when a girl acts cute, I have to rely on force to solve the problem.

85

Q:Why are you still single?

A:Because being single allows for more focus; being single leads to progress~

86

Q:Why do most girls hope to be friends after rejecting a boy’s confession? What is this mentality?

A:Just being polite; otherwise, what can I say, “We are not suitable; let’s be enemies instead”?

87

Q:How to confess using lyrics?

A:I always want to confess to you; my feelings are so grand.

88

Q:What do you think when your confession is rejected?

A:As expected of the girl I like; her taste is indeed good~

89

Q:If a girl helps me, how can I thank her without her boyfriend misunderstanding?

A:You can give her a banner…

Source: Python and Data Analysis (ID: PythonML) WeChat public account, you can follow by reading the original text.

Leave a Comment for Rewards Activity

Why do programmers often not have girlfriends? Feel free to share your “God comments” in the comments section. The editor will select the three most valuable comments to receive 50, 30, and 20 yuan red envelope rewards. The deadline for the activity is November 16 at 12:00.
