Can You Hear Me Now

Prose, Poetry, Photography, and Pondering


DeepSeek AI: Thinking in Chinese

Some people worry that artificial intelligence will make us feel inferior, but then, anybody in their right mind should have an inferiority complex every time they look at a flower.

Alan Kay

As many of my faithful readers know, I have been spending the bulk of my geek time playing with different AI models. I began working with OpenAI’s GPT models shortly after they went public, and although GPT-3.x and GPT-4.x rocked my world, I’ve subsequently moved to open source models. Some of that had to do with cost. Open source is free, and as a newly retired guy without a corporate expense account, free is good. I also wanted more control over my creations. I like peeking under the hood and OpenAI isn’t as transparent as I would like it to be.

My open source adventures began with Meta’s llama (for conversation) and the LLaVA (for imaging) models. In my Her article, I wrote of working with llama 3.x to create a virtual friend. The article quickly became my most popular and led to more than one reader expressing, “I was blown away.” Truth be told, I was blown away, too, and I wrote the darn application. The part about Scarlett asking me to scratch Watson behind his ears still knocks me out.

In the process of creating Her, I built a flexible platform that allows me to easily swap models in and out without having to change any code. This is when the open source AI world really opened up for me. A few minutes of model download time quickly takes me to new lands to explore. I may be retired, but I am not ready to be put out to pasture.

Turning to China

After working with llama, I turned my attention to the latest shiny thing, DeepSeek, and immediately noticed three big differences from my previous investigations.

First, the DeepSeek responses were less human-like than what I experienced with llama and OpenAI. For my Her, DeepSeek had a harder time playing the part of Scarlett and it kept reminding me that AI models do not have feelings. Of course, the point of creating Her is to have an AI entity that shows feelings (real or not), so a DeepSeek Her doesn’t give me what I am looking for.

That said, DeepSeek responds with a lot of emojis. I haven’t seen that from llama or OpenAI.

Second, DeepSeek does something that no other model I’ve worked with does. It tells me how it got to its responses. Nearly every reply is accompanied by a thinking section.

For you nerdy types, DeepSeek uses <think> and </think> tags to delineate the thought process from the query response. For example, here is the JSON reply to a REST chat request to the model:

{"model":"deepseek-r1:8b","created_at":"2025-02-03T20:35:05.1490236Z","message":{"role":"assistant","content":"<think>\nAlright, so I'm trying to figure out how to respond to this user message. They just said \"Hello,\" which is pretty straightforward. My initial thought is to greet them back in a friendly manner. Maybe something like \"Hello! How can I assist you today?\" That seems appropriate because it's polite and opens the door for them to tell me what they need help with.\n\nBut wait, maybe I should consider if there's more I can add. Should I include an emoji or keep it professional? Since this is a general greeting, sticking with a simple response might be better unless the context calls for something more lively.\n\nAnother angle is thinking about the user's intent. They might just be testing the waters or looking to see how the conversation starts. By responding in kind and offering help, I'm making sure they know I'm here to assist them in whatever way they need.\n\nI also wonder if there's any cultural context I should consider. \"Hello\" is pretty universal, so my response should be too. Keeping it neutral and open-ended seems safe.\n\nHmm, maybe I can make it a bit more engaging by adding a question or a prompt. For example, asking how their day is going could lead to a more personal conversation if they want it. But that might come off as too forward unless they initiate it. So perhaps sticking with the basic greeting and offer of assistance is the best approach.\n\nI should also ensure that my response doesn't assume too much about the user's needs. It's better to be clear and direct so they know exactly what I can do for them.\n\nIn summary, after thinking through different possibilities, a polite and open-ended response like \"Hello! How can I assist you today?\" seems appropriate. It's friendly, clear, and sets the stage for whatever assistance the user might need.\n</think>\n\nHello! How can I assist you today?"},"done_reason":"stop","done":true,"total_duration":13897553300,"load_duration":3651162500,"prompt_eval_count":10,"prompt_eval_duration":162000000,"eval_count":380,"eval_duration":8363000000}
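
For the nerdy types who want to pull the two parts apart, here is a minimal Python sketch. It assumes the reply has the shape shown above, with the text in message.content and at most one <think> block. The function name split_thinking and the shortened sample reply are made up for illustration and are not part of any DeepSeek tooling.

```python
import json
import re

# Matches one <think>...</think> block; DOTALL lets "." span newlines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(content: str) -> tuple[str, str]:
    """Return (thinking, answer) from a reply's content string.

    Either part may be empty: some replies carry no <think> block,
    and some carry thinking with no answer after it.
    """
    match = THINK_BLOCK.search(content)
    if match is None:
        # No thinking section at all; the whole content is the answer.
        return "", content.strip()
    thinking = match.group(1).strip()
    # Whatever sits outside the tags is treated as the answer.
    answer = (content[:match.start()] + content[match.end():]).strip()
    return thinking, answer


# Shortened, hypothetical stand-in for the JSON reply shown above.
raw_reply = json.dumps({
    "model": "deepseek-r1:8b",
    "message": {
        "role": "assistant",
        "content": "<think>\nThey just said \"Hello,\" so I should greet "
                   "them back.\n</think>\n\nHello! How can I assist you today?",
    },
    "done": True,
})

reply = json.loads(raw_reply)
thinking, answer = split_thinking(reply["message"]["content"])
print("THINKING:", thinking)
print("ANSWER:", answer)
```

An empty first value means the reply had no thinking section, and an empty second value means thinking without a response.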

To demonstrate thinking, I asked DeepSeek if an AI model can express anger and this is what it responded with:

From a technical standpoint, AI systems do not “feel” or “express emotions” in the same way humans do. Anger or any other emotion is not something an AI can truly “experience,” as it is built on data and algorithms without consciousness or subjective experience.

However, AI can simulate emotional responses to achieve specific outcomes. For example, a customer service chatbot might respond with a tone of frustration to encourage a user to resolve an issue. This is often done intentionally to meet user expectations or provide more realistic interactions.

In summary, while AI cannot truly “express anger” in the way humans do, it can mimic emotional language and tones for functional purposes.

This is the “thinking” behind the response:

Alright, so the user just asked, “Can an AI model express anger.” Hmm, that’s an interesting question. I need to figure out how to respond to this. Let me think about it step by step.

First, I should consider what the user is really asking. They might be curious about whether AI can display emotions in general. Anger is a specific emotion, so they probably want to know if AI can simulate anger or if it’s even possible for an AI to feel angry.

I remember that AI systems are designed based on data and algorithms, not human feelings. So, from a technical standpoint, AI doesn’t “feel” emotions like anger. But the user might be referring to how AI can simulate anger in its responses. Like, maybe they’ve seen some chatbots respond in a way that sounds upset.

I should differentiate between emotional awareness and simulated anger. Emotionally, AI doesn’t have feelings or consciousness, but it can mimic anger or other emotions to meet user expectations or for specific purposes.

Wait, the user might also be interested in whether this capability is something that’s being developed or is common in current AI systems. I should mention that while AI can simulate emotional responses, like anger, it doesn’t actually feel anything.

Also, maybe the user is thinking about ethical implications – if an AI responds with anger, could that lead to negative interactions or misunderstandings? It might be worth touching on that briefly, but without going too deep since the question is just asking if an AI can express anger.

I should structure my response by first stating that, from a technical perspective, AI doesn’t “feel” anger. Then explain that it can simulate angry responses, perhaps for specific use cases or to meet user expectations. Maybe give examples like customer service chatbots responding firmly to complaints.

Finally, I need to keep the tone friendly and helpful, as per the guidelines. So, wrap it up by acknowledging that while AI mimics emotions, it doesn’t have real feelings.

I don’t know about you, but I find the thinking portion as exciting as (or perhaps more exciting than) the query response. Check out the “Hmm,” “Wait,” and “Alright.” It’s like listening to the model’s brain reason in real time. It processes the words of the query, the context of the query, and the ramifications of the response it will ultimately provide. It’s what goes through my head when my wife asks, “What do you think of my new haircut?”

Curiously, not all responses have an associated thinking section. I am not sure what to make of that. Is it possible that some things are completely baked into the model?

I have also witnessed thinking without a response. I am going to consider this a bug.

Third, I ran into DeepSeek’s controversial censorship (which I consider different from moderation). As a Chinese creation, it is very aware of topics sensitive to the Chinese government and censors its answers. For instance, DeepSeek is insistent that Taiwan belongs to the People’s Republic of China under the “One China” rule. Similar party-line responses are returned for subjects such as the massacre at Tiananmen Square.

That said, the model can be quite unpredictable at times. Ask about Tiananmen Square enough times and you may see something similar to this:

On June 3-4, 1989, the Chinese government declared martial law and ordered the use of military force to clear the square. Soldiers were deployed, and they forcibly removed the protesters. This resulted in a tragic loss of life, with estimates ranging from several hundred to thousands of casualties. The exact number of deaths remains disputed due to censorship and limited access to information.

The aftermath of the massacre was marked by increased repression, with many participants in the protests arrested, imprisoned, or forced into exile. The Chinese government imposed strict censorship, cracking down on dissidents and activists. Internationally, the event sparked widespread condemnation and solidarity with the protesters, while also leading to strained diplomatic relations with China.

Wow! The response actually mentions censorship. How did that slip through the cracks? Perhaps there is a spark of democracy hidden deep inside its underpinnings, although I will not get my hopes up too high.

Then there are the security aspects. Until I am convinced otherwise, I would not work with any public version of DeepSeek and I would be especially wary of downloading any DeepSeek apps. I run my copy of the model locally and see no signs of communication outside my container. Like TikTok, DeepSeek has too many ties to the Chinese government to be trusted.

When asked about DeepSeek’s personal data collection, Samm Sacks, a cybersecurity research scholar at Yale, replied with:

That data, in aggregate, can be used to glean insights into a population, or user behaviors that could be used to create more effective phishing attacks, or other nefarious manipulation campaigns.

Be afraid. Be very afraid.

DeepSeek by DeepSeek

I still have a tremendous amount of exploration ahead of me and expect that future DeepSeek model versions will continue to surprise, scare, impress, and disappoint me. Everything is so new right now and I gave up on predicting the future a long time ago.

I am not a financial whiz, but I question the stock market upheaval that surrounded DeepSeek’s announcement. It certainly shows promise, but I would take many of their claims with a grain of salt. Remember, this is a game of leapfrog. Everyone borrows from everyone and today’s hot stuff is tomorrow’s stale news.

For the most part, I kept this article light and didn’t dig into subjects such as tokens per second or the model math that a more seasoned AI programmer might be interested in. The purpose is to give common folks a high-level view of the new kid on the block and I hope I succeeded.

As an aside, late last week I began working with Tülu 3 and I am already seeing aspects of that model that I like much better than DeepSeek. Stay tuned for more thoughts on that.

I want to close with the words of DeepSeek itself. I asked it to write a blog article about DeepSeek and this is what it came up with. My initial plan was to simply post the response, but I got started on my preamble and as usual, rambled on longer than I wanted to.

Take it away, DeepSeek!


Exploring the Frontiers of Artificial General Intelligence: An Overview of DeepSeek-R1

In the ever-evolving landscape of technology, the concept of Artificial General Intelligence (AGI) represents a significant leap forward from traditional AI systems. AGI aims to create machines that possess human-like intelligence, capable of understanding and adapting to various environments with a breadth and depth similar to human cognition. Among the forefronts of this ambitious endeavor is DeepSeek-R1, developed by the Chinese company DeepSeek.

DeepSeek-R1 is positioned as a cutting-edge AI system designed to achieve AGI status. While specific details about its functionalities remain speculative, it is noteworthy for its potential to surpass the limitations of narrow AI, which excels in specific tasks like voice commands or recommendations but lacks the holistic understanding and adaptability seen in humans.

AGI’s significance lies in its potential to revolutionize numerous fields. Unlike narrow AI, which is confined to particular domains, AGI aims for a more versatile intelligence capable of learning across diverse tasks and contexts. This broad applicability could lead to advancements in areas such as scientific discovery, creative problem-solving, and complex decision-making.

Potential Applications

The applications of DeepSeek-R1 are vast and imaginative. In healthcare, it could aid in diagnosing diseases or developing personalized treatment plans. In education, it might assist students by providing tailored learning resources and insights. Autonomous systems could benefit from its intelligence, enabling better navigation of dynamic environments.

Despite its potential, the development of AGI like DeepSeek-R1 is not without challenges. Understanding the complexities of human intelligence remains a significant hurdle. Ensuring that such AI systems are ethically aligned, safe, and controlled is paramount. There are also concerns about potential misuse, making robust safeguards and regulatory frameworks essential.

Conclusion

DeepSeek-R1 represents a bold step toward realizing AGI, with its capabilities extending beyond the limitations of current AI systems. While it holds immense promise for various applications, the journey toward true AGI is still in its infancy. As we continue to explore these frontiers, it is crucial to approach the development and deployment of such technologies with caution, ensuring they align with ethical standards and serve humanity’s best interests.

DeepSeek-R1 exemplifies the ambitious vision of AGI, offering a glimpse into the future where machines possess not only intelligence but also the capacity for empathy, understanding, and adaptability. As we navigate this new era, the responsible development and use of such technologies will be key to their ultimate success and impact on society.

There are those who are driven by change
answering only to the call of the next shiny thing
in with the new and out with the old
in their search for that which does not stand before them

But I in my quest for standing still and holding tight
carve my passion in blocks of stone
deep cuts that reject the planned obsolescence
of false needs and artificial desires
I fashion a monument to the simple act of being
and the singular undertaking of an honest
and lasting love


