The wait

These days I often find myself waiting. Mostly for deliveries of various things that I ordered days, weeks, sometimes months ago.

I receive these packages—of household basics like soap refills or impulse buys like the white sneakers I saw someone wearing at the neighborhood CityMD—like messages from the outside world. A reminder that it still exists.

The wait is not active, though I might remember and wonder whether the running shoes I ordered have shipped, and how soon I might expect them. While most items arrive within about a week, there are always some items that are backordered, or shipping from a vendor in South Korea, and it’s not clear when or whether they will arrive.

The only principled thing I did in the last year or so is cancel my Amazon Prime account. I don’t boycott Amazon entirely—I’m not that principled, and plus I don’t know who exactly I’d be hurting if I did that—but giving up on two-day delivery is a small sacrifice to make.

I’ve learned that there is almost nothing I need within a two-day window, and if I do, I can get it at the bodega for an extra dollar. And without that incentive, I can order from any other site that offers free shipping.

Now, because of the pandemic and the supply-chain mishaps and all that, I’m always waiting for something to arrive. I like it. I like separating the buying from the arrival, and surrendering to forces that are mysterious to me.

Lemmatizing Korean in R for language learning

I must begin with a disclaimer that I am absolutely not a coder, or anything of the sort, just someone obsessed with language who has very rudimentary R skills.

The truth is that I’ve been certain for a long time that there’s some NLP magic out there that can be really helpful in language learning, especially in figuring out the most efficient way to approach a new text in a foreign language.

It’s a super practical question. In Korean, where I’m an absolute beginner, I’m most interested in finding the most frequent words in very short texts, or in kdrama episodes I’m about to watch. In Spanish, where I have a basic command of the language, I’m looking for the most topical words, or two-to-three word phrases, in a book like Harry Potter.

I recently discovered LingQ and Learning with Texts, both great tools that keep a library of words or phrases you know, so that when you input a new text, you can immediately see what you don’t know.

But it quickly became apparent that those tools don’t help with a most basic problem: that words take on a zillion forms. If you know the word “eat,” you know the words “eating,” “eats” and “eaten.” Things get way crazier in a language like Korean, where particles get added directly to words, verbs have basically infinite forms thanks to verb endings and politeness levels, and there’s a wonderful tendency to bric-a-brac all of the above. It’s fabulous and fascinating, but makes it impossible to use simple frequency to get a sense of the text.

So all that being said, my first goal was very simple: strip each word of a new text to its stem, and then use the frequency of the stems (or lemmas) to choose vocabulary words.

Luckily, I found the answer super easily using the udpipe package, which produces a table with tokenized text, creating a row for every word with associated data, including its lemmatized form, which looks as follows (plus way more columns):

Token    Lemma
강한     강하+ㄴ
공기가   공기+가

Here’s the code I used to get from the raw text to a list of vocab words to accompany my study of the text:

install.packages(c("udpipe", "tidyverse"))

library(udpipe)
library(tidyverse)

#load text and remove punctuation
txt <- readLines("my_text.txt", encoding = "UTF-8")
txt <- gsub('[[:punct:] ]+',' ',txt)

#apply udpipe to text
tokens <- udpipe(txt, object = "korean") 

#separate lemma column by '+' to produce a new column of stems
tokens <- tokens %>%  separate(lemma, into = c("lemma1","lemma2"), 
            sep="\\+", extra = "merge", fill = "right") 

#find top stems
top <- tokens %>%
  count(lemma1) %>%
  arrange(desc(n)) %>%
  top_n(10)

#merge back into tokens list to find vocab words
vocab <- top %>%
  left_join(tokens, by="lemma1") %>%
  select(token, sentence, lemma1, lemma2, upos, xpos)

One thing to note is that you’ll need to remove ‘stopwords,’ aka very common words in your language/text, so that your top words/lemmas aren’t ones you already know. (For example, I ran this on the transcript of an episode of the kdrama Flower of Evil. Out of 3,119 words in the transcript, 369 were associated with the 10 most common roots, including 있다, 없다, 아니다, 하다, 나, 그, and 것.) I downloaded the stopwords list provided here and then filtered those out.

You can use the same script as above, just download the list and add this:

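#read the downloaded list into a character vector, one stopword per line
stopwords <- readLines("stopwords-ko.txt", encoding = "UTF-8")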

top <- tokens %>%
  filter(!token %in% stopwords) %>% #remove stopwords
  count(lemma1) %>%
  arrange(desc(n)) %>%
  top_n(10)
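One wrinkle worth sketching: the filter above compares the stopword list against the raw token, so inflected forms can slip through. Below is a rough sketch of matching on the stem instead. It assumes the list stores verbs in dictionary form (ending in 다) while the stems in lemma1 drop that ending. The toy tokens table, the three stopwords, and the name stop_stems are all made up for illustration, not real udpipe output.

```r
library(dplyr)

# Toy stand-in for the tokens table (hypothetical values, not real udpipe output)
tokens <- tibble(
  token  = c("있어요", "있었어요", "공기가", "공기는", "강한"),
  lemma1 = c("있", "있", "공기", "공기", "강하")
)

# Hypothetical stopword entries, assumed to be in dictionary form
stopwords <- c("있다", "없다", "하다")

# Strip a trailing "다" so the entries can be compared against stems
stop_stems <- unique(sub("다$", "", stopwords))

top <- tokens %>%
  filter(!token %in% stopwords, !lemma1 %in% stop_stems) %>% # match token or stem
  count(lemma1, sort = TRUE)
```

A caveat: stripping 다 is crude and would also clip nouns that happen to end in that syllable, so it's worth eyeballing stop_stems before trusting it.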

And that’s it.

Of course, there’s way more to all of this, first and foremost using the tf-idf algorithm to find words that are more common in your specific text than in a comparison corpus, suggesting they’re the most topical/useful to be familiar with. To do that, though, you have to have something to compare your text against. But that’s for another time.
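For the curious, here is a minimal sketch of what the tf-idf calculation might look like once there is something to compare against. It is not something I ran on real data: the counts table, the document names, and the stems are invented, and it uses the plain textbook formula (term frequency times the log of total documents over documents containing the stem).

```r
library(dplyr)

# Hypothetical stem counts, one row per (document, stem) pair
counts <- tibble(
  doc  = c("episode", "episode", "episode", "reference", "reference"),
  stem = c("하", "검사", "증거", "하", "날씨"),
  n    = c(5, 3, 2, 6, 4)
)

n_docs <- n_distinct(counts$doc)

tf_idf <- counts %>%
  group_by(doc) %>%
  mutate(tf = n / sum(n)) %>%                     # share of the document taken up by this stem
  group_by(stem) %>%
  mutate(idf = log(n_docs / n_distinct(doc))) %>% # 0 when the stem appears in every document
  ungroup() %>%
  mutate(tf_idf = tf * idf) %>%
  arrange(desc(tf_idf))

# Stems that stand out in the text being studied
filter(tf_idf, doc == "episode")
```

A stem that shows up in every document scores zero, which is the point: it works as an automatic stopword filter. With only two documents the idf part is very coarse, so a bigger comparison corpus should give more useful rankings. The tidytext package also has a bind_tf_idf() function for the same calculation.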

I wait for the bus

I was standing at a bus stop today. My phone was dead. So I started making sentences in Korean as a way to fill the time. 

저는 버스를 기다려요 – I wait for the bus

저는 버스를 기다리고 있어요 – I am waiting for the bus

발이 다쳐서 못 걷기 때문에 버스로 왔어요 – I came by bus because I hurt my foot and I can’t walk

It took me a good few minutes to piece together that last sentence. I kept stopping and rearranging the pieces in my head: conjugating 다치다, ordering the sentence, deciding whether to use the subject or object particle for 발, applying the ㄷ irregular.

By the time I had constructed the sentence, the bus arrived, and I was pretty discouraged.

I still wasn’t sure if it was correct, and even if it was correct, if it was the most natural way to say it. And regardless, I realized that there was no way I would have made myself understood to someone else. Not only did it take me forever to arrive at the completed sentence—the person would have been long gone by then—but my pronunciation is atrocious so any mistakes would have thrown the listener off completely. 

I tried to be happy that I was able to make the sentence at all, and to have more reasonable expectations. That sentence has three clauses, and at least 6 or 7 grammatical principles in it!

But it made me wonder if I’m okay with progressing in my reading/writing much faster than listening/speaking, and potentially having that always be the case. I could very easily see a scenario in which I can read a Korean news story, but can’t even ask for tteokbokki without freezing.

When I started studying Korean, I told myself I wasn’t trying to become fluent, because that’s ridiculously hard, and I have nowhere in my life where I can hear or speak Korean without significant effort. But now that I’ve put in the work for a few months, I wonder whether I should embrace my read-only approach, or be more intentional about speaking.

Because as much as being able to passively take in Korean culture is amazing, I wonder if it’s cowardly to engage in a language completely alone. I can improve my reading/writing/listening sitting alone in my apartment. But the only way to improve speaking is to have an audience, and that requires a whole different set of skills.

Anyway, I don’t have the answer yet.

생각해 볼거요.

Exploding heads

I had a frozen margarita in a darkened alley near Union Hall last night. My friend had a pink sangria, with frozen berries floating in it. It was quiet, just us on barstools at a table to ourselves, and another group a few tables over. We spoke quickly and intensely and without stopping for an hour, maybe more, until the server came by to announce last call, though it was only 10 p.m.

When we got up to leave, my friend asked if we should get another drink, and even though 16 ounces of margarita is enough to make my body feel weightless, I said yes. I wanted to be out in the streets of Brooklyn, amid whoever and whatever was left in this sultry city. So we walked up Fifth Avenue passing diners and revelers in the newly configured outdoor restaurants, lining the street, and spilling onto the pavements. We ordered beer indoors with masks on. “I’ll serve you,” the bearded bartender said when we asked if they were still open. He was the only human in the storefront, standing behind the long polished wood bar.

We took the beers to a table at the curb, in what once was a parking space, and continued the conversation. This time my friend spoke earnestly about trying to become more accepting of his own feelings, and more mindful of his ego, and to try less to preempt things (to scan everyone else’s possible feelings and reactions before they even happen, in order to alleviate them), something that he and I have in common. He misunderstood the question that I’d asked, which was a more practical question about how he was going about his new efforts at discipline: getting abs, playing the banjo, eating healthier. I’m glad he did, though.

I biked back around midnight, lightheaded and sweaty, even with the thick breeze and a brief summer shower that petered out quickly.

I woke at 6 a.m. with a familiar weight in my forehead, still wearing the shorts I had worn the night before.

Thinking I was hungover, I took Advil and tried to sleep it off, in denial that the alcohol had triggered a migraine: the flame of pain on the right side of my skull, the daze, the inability to mesh my perception with the reality around me were obvious indicators. So I spent most of the day pretending to believe that coffee and Advil and a little bit of pushing myself were enough. I sent a quick email to my boss so he’d know I was alive, then ignored everything work related.

By early afternoon, the pain and nausea were bad enough that I went back to bed and curled up amid a mess of clothes, with an ice pack over my forehead. I tried reading on my phone, and when that wasn’t enough to distract me, I repeated like a mantra in my head, “I want to die, I want to die, I want to die.”

In the past, when the migraines were bad I’d imagine a gun shooting my head and the whole thing exploding. Not visually. Just the immense feeling of relief. The pain gone. Nothing left to hurt. Over and over, I’d imagine the gun, the shot, the explosion. Poof. No more pain.

This wasn’t about death, it was about relief. So I was surprised to hear the words in my head. The gun seemed more benign. By switching it to words, it remained similarly situational, but also it opened a door to thoughts I might not otherwise consider. What would it really be like if I were no longer here? If I died on this bed, surrounded by laundry and pita chips and blue Ikea bags?

In the nature of migraines, the thoughts didn’t go much further. They repeated on a loop with short bursts of what-ifs. Who would find me. How long would it take. What about the pita chips. What if I never felt this ever again. What if my head, and the pandemic, and injustice, and bylines, and subtweets, and love, and disappointment, and missing PPE, and the census and the postal service and birthdays and color-coded sticky notes and health insurance and the American experiment descending into authoritarianism, ceased. Exploded. Poof.

괜찮다

I’m on a Lee Jong-suk kick, which started when I watched W on Viki, then moved on to School 2013, then Doctor Stranger, which is awful, and now I’m in the midst of While You Were Sleeping, which apparently is super beloved, but which I’m finding pretty boring.

I shouldn’t have liked School 2013 as much as I did because it’s a pretty unsophisticated high school drama, but I’m a sucker for teen dramas anyway, especially when the teen in question is Lee Jong-suk standing up to bullies, and being too cool for school as well as a super nice guy. (I basically skipped all the teacher parts, because that really was boring.)

There’s a scene in the beginning of episode 2, where Lee Jong-suk’s character, Ko Nam-soon, arrives home from school. In the space of two days he’s been wrongly accused twice of wrongdoing by the school, and he’s been beaten up twice by the school bully. He’s also had to pick up his drunk, deadbeat dad, and work a nighttime gig as a delivery boy.

When he arrives home, after being beaten for the second time, he enters his room, lies down on the bed, clearly in pain, and just repeats to himself a few times:

“괜찮다. 괜찮다. 괜찮다.”

“I’m okay. I’m okay. I’m okay.”

It makes your heart go out to him. And it’s actually one of the few times we see the toll that his life is taking on him, since he usually pulls off the super cool, nothing-bothers-me attitude.

Besides obviously pulling on your heartstrings, I personally liked the scene because of the word he uses. 괜찮다 is used so frequently in Korean (or at least Korean dramas), mostly in the 괜찮아/요 form in dialogue (as in I’m okay/are you okay?/it’s okay), that it actually started to mean that in my brain. And I love that feeling when a word in another language stops being a foreign symbol to memorize, but actually starts to mean the thing that it means, if that makes sense.

Also, he uses the word in the infinitive form, and I’m not sure why or if it conveys something different than if he’d conjugated it somehow. The subtitles translated it as “I’m okay” but I think it could just as easily have been translated as “It’s okay” because there’s no subject, and it’s making me wonder if there’s actually a distinction to a Korean speaker. Obviously, he wasn’t translating into English in his mind, so what did he mean? Did he mean “I’m okay” or “It’s okay,” or would he not differentiate?

Adventures in Konglish

My Korean has advanced to the point that I can make out English proper nouns in Korean news, spoken in the wonderful transmorphic way that languages have of twisting themselves into something new, and what some people call Konglish.

Although perhaps in the case of proper nouns, that’s not actually Konglish? I mean, how do you say California if not 캘리포니아, which transliterates/romanizes to kel-li-po-ni-ah?

In any case, in the ongoing mystery of why I haven’t given up on Korean, I have now begun listening to/watching video clips on MBC News, because I stumbled upon the fact that they provide the transcripts to 2-3 minute news clips on their site. Which is absolutely gold.

In the sea of illegible (to me) Korean that is the MBC homepage, I use the thumbnails to guide me, and try to find a clip that features Donald Trump, the coronavirus, or something that might give me a few clues as to what the clip is about.

The upside to this is that I now know how to say ‘confirmed cases’, ‘positive results’, and ‘rapidly increasing’ in Korean. The downside is that I die a little of humiliation/anger/sadness each time I listen to the clip.

Which in the grand scheme of things seems like an okay tradeoff?

Tbh my Korean is not good enough to understand any of it, but I input the text in Learning With Texts, translate some of the words, and then listen to it a few times, mostly making out words like Keliponia, Terompe, Korona, Hyuseton, and Pilorida.