Cassidoo’s Interview question of the week | 446

Hello :waving_hand:

This week’s question is about finding the majority element in an array. Here it is.

Find the majority element in an array (one that appears more than n/2 times) in O(n) time and O(1) space, without hashmaps. Hint: the Boyer-Moore voting algorithm might help if you can’t figure this one out!

Example

> majorityElement([2, 2, 1, 1, 2, 2, 1, 2, 2])
2

> majorityElement([3, 3, 4, 2, 3, 3, 1])
3

Happy coding!

2 Likes
# Boyer-Moore voting algorithm
def majority_element(numbers)
  numbers
    .reduce([nil, 0]) { |(candidate, count), el|
      candidate = el if count.zero?
      [candidate, count + ((el == candidate) ? 1 : -1)]
    }
    .first
end
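One caveat worth noting: a single Boyer-Moore pass only returns the right answer because the problem guarantees that a majority element exists. If the input might not contain one, a second counting pass is needed to confirm the candidate. A minimal sketch (the method name `verified_majority_element` is my own, not from the question):

```ruby
# Boyer-Moore with an optional verification pass, for inputs where a
# majority element is NOT guaranteed to exist. (The puzzle here
# guarantees one, so the solutions in this thread can safely skip this.)
def verified_majority_element(numbers)
  candidate = numbers.first
  count = 0

  # Pass 1: find the only possible majority candidate.
  numbers.each do |number|
    candidate = number if count.zero?
    count += (number == candidate ? 1 : -1)
  end

  # Pass 2: confirm the candidate really appears more than n/2 times.
  numbers.count(candidate) > numbers.size / 2 ? candidate : nil
end
```

With a majority present it behaves like the versions below; without one it returns nil:

```ruby
verified_majority_element([3, 3, 4, 2, 3, 3, 1]) # => 3
verified_majority_element([1, 2, 3])             # => nil
```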
3 Likes
def majority_element(numbers)
  candidate = numbers.first

  numbers.reduce(0) do |vote_total, number|
    candidate = number if vote_total.zero?
    vote_total + (number == candidate ? 1 : -1)
  end

  candidate
end

One of those I’ve committed to memory (along with last week’s Kadane’s algorithm).

I will say that from my (admittedly somewhat lazy) benchmarks, the Ruby runtime tends to run the following sort-based version faster, even on arrays with thousands of elements.

def majority_element(numbers)
  midpoint = numbers.size / 2

  # because of the 'appears more than n/2 times' aspect, 
  # the answer will equal the median of all the numbers
  numbers.sort[midpoint]
end

But I also wouldn’t be surprised if there’s some more optimized way to write the Boyer-Moore voting algorithm in Ruby.

2 Likes

Here is what is probably the most obvious solution. It was the first one that crossed my mind, and after that I was so infected with the hash virus that I couldn’t figure out how to do it in O(1) space…

def majority_element(array)
  array.tally.max_by { |_key, value| value }.first
end

BTW, @JoshDev, your sort version is really nice and quite clever, but unfortunately not so performant; I don’t think you can beat the for/each Boyer–Moore majority vote version. :expressionless:

Here are the benchmarks. I made a smallish framework for answers, tests, and benchmarks; the web pages are generated with RDoc.

1 Like

Wow, it’s surprising to me that reduce incurs such an overhead as opposed to each. (I looked in your repo to see the each implementation.) Unless I’m reading the results wrong.

Thanks for the benchmarks!

1 Like

You are welcome, and thank you for the solutions; otherwise there would be nothing to benchmark (which is fun in itself).

Well, if you check the benchmarks for the other questions, almost every time the each solution is the most efficient one. I don’t know much about Ruby internals, but each is probably extremely well optimized, and maybe it is fastest because it is quite simple; after all, you are just iterating over the array? Don’t know…
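Out of curiosity about that each-vs-reduce gap, here’s a minimal sketch using the stdlib Benchmark module (not the framework that produced the i/s numbers in this thread, and timings are machine-dependent; a proper comparison would use benchmark-ips with warmup):

```ruby
require "benchmark"

# Roughly 200k elements with a guaranteed majority of 42s.
numbers = Array.new(100_000) { rand(500) } + Array.new(100_001, 42)
numbers.shuffle!

# each-based Boyer-Moore pass, mutating two locals.
each_time = Benchmark.realtime do
  candidate, count = numbers.first, 0
  numbers.each do |n|
    candidate = n if count.zero?
    count += (n == candidate ? 1 : -1)
  end
end

# reduce-based pass, building a fresh [candidate, count] pair per element.
reduce_time = Benchmark.realtime do
  numbers.reduce([nil, 0]) do |(candidate, count), n|
    candidate = n if count.zero?
    [candidate, count + (n == candidate ? 1 : -1)]
  end
end

puts format("each:   %.4fs", each_time)
puts format("reduce: %.4fs", reduce_time)
```

One plausible factor: the reduce version allocates a new two-element array on every iteration, while the each version only reassigns locals.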

Hm…it is a much smaller difference with YJIT disabled:

# 1_000 randomly shuffled integers (499 unique ones)

josh_dev ... sort_midpoint_oneliner :    38955.8 i/s
boyer & moore ................ each :    28727.9 i/s - 1.36x  slower
josh_dev ................... reduce :    24049.8 i/s - 1.62x  slower
izkreny ..... tally_max_by_oneliner :    18527.6 i/s - 2.10x  slower
fpsvogel ........... reduce_chained :    12328.7 i/s - 3.16x  slower

# 1_000_000 randomly shuffled integers (499_999 unique ones)

boyer & moore ................ each :       29.8 i/s
josh_dev ................... reduce :       23.5 i/s - 1.27x  slower
josh_dev ... sort_midpoint_oneliner :       15.1 i/s - 1.97x  slower
izkreny ..... tally_max_by_oneliner :       14.7 i/s - 2.03x  slower
fpsvogel ........... reduce_chained :       13.7 i/s - 2.17x  slower

And this is with jruby (JIT is automatically enabled):

# jruby 10.0.3.0 (3.4.5) 2026-02-02 b0be2ab713 OpenJDK 64-Bit Server VM 21.0.10+7-Ubuntu-124.04 on 21.0.10+7-Ubuntu-124.04 +indy +jit [x86_64-linux]

# 1_000 randomly shuffled integers (499 unique ones)

boyer & moore ................ each :    65645.1 i/s
josh_dev ... sort_midpoint_oneliner :    39935.1 i/s - 1.64x  slower
josh_dev ................... reduce :    32431.0 i/s - 2.02x  slower
fpsvogel ........... reduce_chained :    30382.3 i/s - 2.16x  slower
izkreny ..... tally_max_by_oneliner :    22198.3 i/s - 2.96x  slower

# 1_000_000 randomly shuffled integers (499_999 unique ones)

boyer & moore ................ each :       29.9 i/s
josh_dev ................... reduce :       18.3 i/s - 1.64x  slower
fpsvogel ........... reduce_chained :       17.1 i/s - 1.75x  slower
josh_dev ... sort_midpoint_oneliner :        6.5 i/s - 4.58x  slower
izkreny ..... tally_max_by_oneliner :        3.7 i/s - 7.98x  slower

Looks like (Y)JIT makes a bit of a difference in the ordering, but generally each is always the winner when you have a bigger pile of data…

P.S.
Another thing: it could be that I am doing something wrong inside the benchmark code itself, although I believe it is OK (for example, in cases where arrays are mutated, I always clone them before passing them to the benchmark)…

1 Like

Interesting, thanks for the YJIT comparison. To be clear, I wasn’t questioning your results, just expressing my surprise at them, as I so often do when my assumptions are knocked over by benchmarks!

1 Like

Thanks for the benchmarks @izkreny

Always interesting to see. I didn’t have YJIT enabled and was working with array sizes on the order of 1-10k elements, so that’s why I was seeing the #sort solution perform fairly well. And yeah, interesting to note the performance overhead of #reduce.

I did have another idea that I haven’t found the time to code up: using the optimized median search shown in this Geeks 4 Geeks article. They essentially show the “quickselect” algorithm (selecting a single element with a quicksort-style partitioning approach) for finding the median of an array. I think there’s a chance this performs fairly well on average: because the target element must appear more than n/2 times in the array, there’s a strong chance of it being randomly selected as the pivot.
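For the curious, here’s a rough sketch of that quickselect idea with a random pivot (the method names are mine, not from the article). Average time is O(n) and the worst case is O(n^2); also note that this partition-by-select version allocates intermediate arrays for readability, so it isn’t O(1) space — an in-place partition would be needed for that.

```ruby
# Find the element that would sit at (0-based) index k of the sorted
# array, without fully sorting it.
def quickselect(array, k)
  array = array.dup
  loop do
    pivot   = array.sample
    smaller = array.select { |x| x < pivot }
    equal   = array.select { |x| x == pivot }

    if k < smaller.size
      array = smaller                        # target is left of the pivot
    elsif k < smaller.size + equal.size
      return pivot                           # target is the pivot itself
    else
      k -= smaller.size + equal.size         # target is right of the pivot
      array = array.select { |x| x > pivot }
    end
  end
end

# As in the sort version, the majority element is the median.
def majority_by_median(array)
  quickselect(array, array.size / 2)
end
```

Both thread examples work the same way as with the sort version:

```ruby
majority_by_median([2, 2, 1, 1, 2, 2, 1, 2, 2]) # => 2
majority_by_median([3, 3, 4, 2, 3, 3, 1])       # => 3
```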

1 Like

@fpsvogel no worries, I often tend to over-explain myself! :face_with_peeking_eye:

But at least I am confident that, despite the difference in results, it is just fine to benchmark with YJIT enabled, as everything is faster, and it is nice to get insight into what works much better in that context.

@JoshDev you are welcome! Yeah, there is probably room for improvement in the “sort” version, although I believe it will never beat the “each” one…and to be fair, this benchmark, with all elements except the prevalent one being unique, just makes it harder for sort to shine. :sweat_smile:

I don’t know :sweat_smile: it’s so simple

def majority_element(array)
  candidate, count = array[0], 0

  array.each do |element|
    candidate = element if count.zero?
    count += element == candidate ? 1 : -1
  end

  candidate
end