Why do two languages get mistaken for one another?

SilentStriker@piefed.social · edit-2 9 hours ago

Why do two languages get mistaken for one another?

EponymousBosh@awful.systems · 11 minutes ago

I’m a native English speaker with auditory processing problems. I have occasionally mistaken spoken German for “English that my brain didn’t want to cooperate with.”

Yaky@slrpnk.net · edit-2 3 hours ago

Curiously enough, spoken Farsi sounds like Russian to me (I cannot understand it, but it must have the same sounds, phonemes, or pauses, not sure), and I am fluent in Russian.

Infrapink@thebrainbin.org · 6 hours ago

Yes, languages get mistaken for each other all the time when one is not familiar with the writing system, and sometimes even when one is. I have struggled to understand posts in Spanish before realising they’re actually in Portuguese, which I don’t speak. (Also I’m pretty sure Norwegian and Danish are actually the same language).

Can you tell the difference between Telugu and Kannada just by looking at them? How about Arabic and Persian? How about Arabic and Ottoman Turkish? Persian and Kurdish? Yoruba and Xhosa? Ukrainian and Kazakh? Sumerian and Akkadian? Actually, could you tell the difference between Akkadian and Old Persian? They are both written with cuneiform characters, but the characters themselves are apparently as different as hiragana and hangul.

If your sentence is written entirely in Chinese characters, there is no way for somebody unfamiliar with them to determine whether it’s Japanese, Mandarin, or Hokkien. And if somebody hasn’t seen enough Japanese text to figure out the difference between kana and Chinese characters, they still won’t be able to tell the difference.

As to why Google can’t tell, Google doesn’t actually understand anything. It’s based on a massive database of which characters and combinations of characters come next to each other (and there’s also some Markov stuff to account for common spelling mistakes). If your search string is made entirely of Chinese characters, it’s going to get hits on websites written with Chinese characters, many of which will be in Mandarin. Google.com probably isn’t able to detect your UI or browser language settings. To ensure you get results from Japan, try using google.jp instead, as it will prioritise Japanese results.

dan1101@lemmy.world · 4 hours ago

Spanish and Portuguese

German and Russian

German and Dutch a bit

trashcroissant@lemmy.blahaj.zone · 3 hours ago

Spanish is my first language and one of the most embarrassing moments of my life was asking two people if they were speaking Portuguese and they said no, Spanish.

Which would be okay because we all have different accents… But we were from the same country.

Yaky@slrpnk.net · 4 hours ago

German and Russian?! Maybe to a person not at all familiar with either Germanic or Slavic languages, but they are not as closely related at the other 2 pairs.

dan1101@lemmy.world · 4 hours ago

Yeah I almost didn’t put this one because I don’t get it, but several people I know have confused them.

kluczyczka (she/her)@discuss.tchncs.de · 4 hours ago

most people go by sound. tgere is a phenomenon that (european) portugese gets viewed as ‘some slavic language’ by laypeeps. it is just arbitrary what people will think your language is.

CombatWombat@feddit.online · 8 hours ago

I don’t have a concrete answer to your question, but it does sound an awful lot like you’re getting xckd 2501’d:

Pommes_für_dein_Balg@feddit.org · 9 hours ago

Yes, people often confuse one language for another if they don’t know either language.

DokPsy@lemmy.world · 10 hours ago

As an English speaker who’s dabbled in other Germanic and Latin languages, absolutely. I dislike Dutch specifically because it will either start to read like English or German then fall off the deep end real quick.

Portuguese (at least Brazilian) looks and sounds like a mashup between Spanish and French.

This is kind of to be expected when you look at the history of how these languages evolved. The reason Japanese and Mandarin would be so easily confused is that the writing system was imported from China and there are a lot of words that either still look like the parent words or very similar. The same for the Latin and Germanic languages as well as their related offshoots. The history of invasion, language mixing, adaptations, and standardization has produced languages that are in various levels of mutual understanding.

SilentStriker@piefed.social · edit-2 10 hours ago

Even though there are words in both Japanese & Mandarin that are written similarly to one another or exactly the same but keep in mind pronunciation is different, such as 警察 (けいさつ) from Japanese but in Mandarin that’s pronounced as jǐngchá. Dutch and German are still different languages but are there also pronunciation differences for words that are spelled the same?

DokPsy@lemmy.world · 5 hours ago

Not on the same level considering the difference in how the Eastern and Western languages are formed.

I’m generalizing a bit here into Western being Germanic or Latin based languages and Eastern being primarily Chinese.

Western languages typically use symbols that represent the component sounds of a language (phonemes) where Eastern languages use symbols for whole words or concepts (morphemes). So you would have a single symbol for house or tree instead of a series of symbols for the words.

This means that the word spelling would change in Western languages as the spoken language changes more rapidly and show a large difference between two closely related languages in their spelling for the same word.

Conversely, in Eastern languages, the spoken language is not as closely tied to the words used. (To translate into English as best I can: The character for house could be pronounced like house, home, building, cave, lean-to, castle, shed, etc depending on where it’s being said). So, you’d have, after a few generations, the same character pronounced two very different ways with two different meanings

HobbitFoot @thelemmy.club · 6 hours ago

are there also pronunciation differences for words that are spelled the same?

Through through tough thought, I can’t think of any.

False@lemmy.world · 9 hours ago

Is your question whether this happens to other languages? If so that’s an easy “yes”.

GreenBeard@lemmy.ca · 7 hours ago

It’s the logographic nature of the text that makes it more common in eastern languages. Modern Western written languages, for the most part, are strictly alphabetic so since the symbol doesn’t represent concepts, only vocal Phonemes which when strung together represent a concept, it’s really hard to misinterpret the symbols, and it really doesn’t matter which language you’re representing with them. This gets a little fuzzy in Celtic languages because they have a lot of sound combinations that don’t exist in other languages, and there’s some confusion at times when looking at Western script and Cyrillic because while they’re both rooted in the Latin Alphabet, they both evolved separate ways of handling various sounds, but since similar words that share a common meaning (and often a common root word) in a lot of languages do in fact sound different, they also are often spelled differently enough that it becomes obvious quickly which language you’re using. There are of course some words and short phrases that do in fact get written identically in multiple, closely related languages, and that can confuse machines and people, but it’s far less common than with eastern languages.

just_another_person@lemmy.world · edit-2 10 hours ago

Kanji is a derivative of older Mandarin character sets. That’s why they look alike. Modern Kanji is of course dissimilar, but the glyphs still look similar, so unless you know one or the other deeply, I wouldn’t find it odd for people unfamiliar with either to be able to tell some of them apart.

The example characters you’ve provided do look fairly similar to me, though I’m somewhat familiar with Kanji and can normally tell the difference.

Zwuzelmaus@feddit.org · 10 hours ago

Of course.

Rentlar@lemmy.ca · 10 hours ago

In current day tech, East Asian characters are taken from a combined set called “CJK unified ideographs”. When regional variants exist, the language it renders as depends on the font of the user’s device.

There was a recent example that came up: 骨(bone) has the little square on the left in Simplified Chinese and on the right in Japanese. With Hiragana it’s more obvious because of all the curvier letters, but with kanji only phrases even smartphones tend to mix it up.

I can’t tell between Hungarian, Romanian, Bosnian, Albanian, Czech or Slovak, because I haven’t really studied any of them or know any words. In the Cyrillic field, Belarusian, Russian, Ukranian (except for the existence of ï), Bulgarian, Serbian I probably couldn’t be able to tell you what faced with a random paragraph of text.

kluczyczka (she/her)@discuss.tchncs.de · 9 hours ago

are you asking why google can not distinguish? that idk. it should be possible to discern if a text input is japanese or chinese by just looking at the characterset if it is not exclusively using the unified unicode characters.

for me, as a writing stan, it is possible to guess what language is written. the presence of kana makes it, indeed, trivialy easy. but i have learned both chinese and japanese a bit (only in writing though, lol), so that i might be lucky enough to find a character to say for sure “that’s written different in chinese”. people who don’t know anything just see complex, chinese-ish characters and say chinese. (even wit kana present) the same ignorance is at work, when western people call a farsi or urdu text arabic, or anything written in cyrillic russian.

for languages written in latin, i e.g. usually have to look twice to see if a text is danish, swedish, or norwegeian, since i never learned any of these properly, and need to find the distinctive features.

-	日本語	中文
擲弾兵	てきだんへい	Zhì dàn bīng
艦隊	かんたい	Jiànduì
陸軍	りくぐん	Lùjūn
神社	じんじゃ	Shénshè
地獄	じごく	Dìyù