Bejeweled 1 AI (Part 1): Getting Board Information With OpenCV
Updated: Dec 18, 2021
When I was a kid, I loved to play the original Bejeweled (Diamond Mine). While the game is much simpler than the later releases, I found the music to be the best. Since I just installed Windows 10 on my MacBook, why not try to create an AI bot that plays Bejeweled 1?
Why Build An AI Bot For Bejeweled 1?
There’s no doubt that you can search on YouTube and find examples of AI playing Bejeweled. Many of the bots play Bejeweled Blitz and Bejeweled 3. So, what’s the point?
The point isn’t to make a groundbreaking proof of concept. Instead, I have the following goals in mind for this little project:
Gain some experience working with OpenCV.
Build two types of AI to play the game: a rule-based AI and a reinforcement learning AI.
Before getting into the code, a few notes need to be mentioned:
The original Bejeweled was released for Windows. While it’s possible to run the game under Wine, I’ve had issues running applications (especially games) on Wine. Thus, I’ll be avoiding it.
I’ve heard that there was a Mac OS version that was offered. Yet, I’ve never seen it offered by PopCap themselves. It could have been offered via a physical copy. Even if I had a copy, I doubt recent versions of Mac OS X can even run the software since it could have been written for Mac OS Classic.
Since PopCap was bought out, it no longer offers the game for purchase directly. Instead, the only way to buy the game is through Steam.
I don’t have the game in fullscreen.
As a result, the code base takes these assumptions into account.
Handling GUI and Loading Screen
Some programs provide an API that allows you to get information about their current state. Bejeweled 1 doesn’t offer one, so we’ll have to take a screenshot of the game window and dissect its contents with OpenCV to get the board information.
I created a class for managing and interacting with the GUI.
Launching the Game
The first thing that needs to be done is to get the window handle of our application. In Windows, all GUI applications have a window handle that gives information about the application. In our case, we want to get information about where and how big the window is.
The following Python method launches and finds the game window:
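import os
import time
import win32gui

def launchGame(self):
    # Launch Bejeweled 1 through Steam and store the game's window handle.
    # Raw strings keep the backslashes in the Windows paths literal.
    steamPath = r"C:\Program Files (x86)\Steam\steamapps\common\Bejeweled Deluxe"
    if os.path.isdir(steamPath):
        os.system(r'"C:\Program Files (x86)\Steam\Steam.exe" steam://rungameid/3350')
    else:
        # Raising a bare string is a TypeError in Python 3, so raise an exception.
        raise RuntimeError("Steam or the game isn't installed.")
    hwnd = 0
    oneTimeCheck = True
    while hwnd == 0:
        hwnd = win32gui.FindWindow(None, "Bejeweled Deluxe 1.87")
        if hwnd != 0 and oneTimeCheck:
            # Discard the first match and look the window up again.
            oneTimeCheck = False
            hwnd = 0
            time.sleep(0.5)
        time.sleep(0.5)
    self.hwnd = hwnd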
Notice that we look up the window handle twice. When automating this kind of thing, not everything is straightforward. In our case, a hidden window that handles background processes may briefly be active before the main window shows up, so the first match is discarded and the lookup is repeated.
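The "look twice" idea can be sketched as a small polling helper. This is only an illustration, not the code from the repository: the name wait_for_window, the required_hits parameter, and the timeout are assumptions added here, and the finder is passed in as a callable so the logic can be tried without a real window (in the article that role is played by win32gui.FindWindow).

```python
import time


def wait_for_window(find_window, required_hits=2, interval=0.5, timeout=30.0):
    """Poll find_window() until it has returned a nonzero handle required_hits times.

    find_window is any zero-argument callable that returns 0 when nothing is found.
    The timeout is a hypothetical addition so the loop cannot spin forever.
    """
    hits = 0
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        hwnd = find_window()
        if hwnd != 0:
            hits += 1
            if hits >= required_hits:
                return hwnd
        time.sleep(interval)
    raise TimeoutError("game window did not appear in time")


# Stand-in finder: nothing, a short-lived window, nothing, then the main window.
fake_results = iter([0, 101, 0, 202])
handle = wait_for_window(lambda: next(fake_results), interval=0.0)
print(handle)  # 202
```

Passing the finder as an argument keeps the waiting logic separate from the Windows-specific call, which also makes it easy to adjust how many sightings are required.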
Getting An Image From The Window
After getting the handle, we can capture the game screen. For readability, I split the code into two methods. The first method gets the window's screen coordinates along with its width and height. It’s boilerplate from this StackOverflow question.
def getWindowDimensions(self):
    # GetWindowRect returns (left, top, right, bottom) in screen coordinates.
    rect = win32gui.GetWindowRect(self.hwnd)
    x = rect[0]
    y = rect[1]
    w = rect[2] - x
    h = rect[3] - y
    return (x, y, w, h)
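As a quick worked example of that arithmetic, here is the same conversion as a standalone function. The name rect_to_region and the sample rectangle are made up for illustration. The input is a (left, top, right, bottom) tuple and the output is the (x, y, width, height) form expected by a screenshot region.

```python
def rect_to_region(rect):
    """Convert (left, top, right, bottom) into (x, y, width, height)."""
    left, top, right, bottom = rect
    return (left, top, right - left, bottom - top)


# Hypothetical window whose corners are at (100, 50) and (740, 530).
region = rect_to_region((100, 50, 740, 530))
print(region)  # (100, 50, 640, 480)
```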
The second method then uses PyAutoGUI to capture the screen and converts the result into a format OpenCV can use. Part of the method borrows from this StackOverflow question.
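import numpy as np
import pyautogui

def getWindowShot(self):
    # Capture only the game window and convert the PIL image to a NumPy array.
    region = self.getWindowDimensions()
    img = pyautogui.screenshot(region=region)
    return np.array(img)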
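One detail worth keeping in mind: the screenshot comes back as a PIL image, so the resulting array is in RGB channel order, while OpenCV conventionally works in BGR. The sketch below shows the channel swap with plain NumPy slicing. The helper name rgb_to_bgr is an assumption for illustration, and the two-pixel array stands in for a real capture.

```python
import numpy as np


def rgb_to_bgr(image):
    """Return a copy of an H x W x 3 array with the first and last channels swapped."""
    return np.ascontiguousarray(image[:, :, ::-1])


# A 1x2 "screenshot" in RGB order: one pure red pixel, one pure blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
bgr = rgb_to_bgr(rgb)
print(bgr[0, 0].tolist(), bgr[0, 1].tolist())  # [0, 0, 255] [255, 0, 0]
```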
Handling The Loading Screen
Before getting to the game screen, the game’s assets must load. Afterward, we need to click on the play button.
For simplicity, I used the OCR software Tesseract to find the text we’re looking for. I also restricted OCR to the part of the image where the text appears. Once the position of the text was found, I hardcoded the button's offset and clicked the button to play.
import cv2
import pytesseract

def handleTitleScreen(self):
    canPlayNow = False
    gray = None
    W = 0
    H = 0
    x = 0
    y = 0
    while not canPlayNow:
        img = self.getWindowShot()
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # shape is (rows, columns): index 1 is the width, index 0 is the height.
        endX = int(gray.shape[1] * (67/100))
        startX = int(gray.shape[1] * (1/4))
        startY = int(gray.shape[0] * (3/4))
        endY = int(gray.shape[0] * (7/8))
        # Only run OCR on the strip where the play text appears.
        gray = gray[startY:endY, startX:endX]
        text = pytesseract.image_to_string(gray)
        for line in text.split("\n"):
            if line == "CLICK HERE TO PLAY!":
                canPlayNow = True
                x = startX
                y = startY
                W = gray.shape[1]
                H = gray.shape[0]
                boxes = pytesseract.image_to_boxes(gray)
                break
    # Convert the crop-local position to screen coordinates, then click the button.
    region = self.getWindowDimensions()
    x += region[0]
    y += region[1]
    pyautogui.moveTo(x + 108, y + 31)
    pyautogui.click()
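The coordinate bookkeeping in that method can be checked on its own, without Tesseract or a running game. The following sketch uses the same fractions and the same hardcoded (108, 31) offset as the method above. The function names and the 640x480 window at (100, 50) are hypothetical values chosen only to make the arithmetic concrete.

```python
def ocr_search_region(width, height):
    """Return (start_x, start_y, end_x, end_y) of the strip searched for the play text."""
    start_x = int(width * (1 / 4))
    end_x = int(width * (67 / 100))
    start_y = int(height * (3 / 4))
    end_y = int(height * (7 / 8))
    return start_x, start_y, end_x, end_y


def click_point(window_origin, crop_origin, button_offset=(108, 31)):
    """Translate the crop's top-left corner into an absolute screen position to click."""
    win_x, win_y = window_origin
    crop_x, crop_y = crop_origin
    off_x, off_y = button_offset
    return (win_x + crop_x + off_x, win_y + crop_y + off_y)


# Example with a hypothetical 640x480 capture whose window sits at (100, 50).
sx, sy, ex, ey = ocr_search_region(640, 480)
print((sx, sy, ex, ey))                  # (160, 360, 428, 420)
print(click_point((100, 50), (sx, sy)))  # (368, 441)
```

Three offsets are added together here: the window's position on screen, the crop's position inside the window, and the button's position relative to the crop.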
Note that if a game was saved before exiting, a prompt will ask whether to continue it. The code doesn’t currently handle this prompt.
Grabbing Board Information
Now that we can capture the game screen and get past the loading screen, we can play the game. Before we can program the AI, we have to know the state of the board.
The following picture shows the start of a new game:
The start of a new game
For the remaining code in this post, we’ll be referring to this picture.
Getting Only The Playing Area
Upon a closer look, we have a progress bar, a submenu, and the game area itself. We only really need to have our bot process the game area. To crop the image for the game area, we need to get the outline of the square containing the game pieces. The following method will return the coordinates containing the game area:
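def _getPlayingFieldCoord(self, img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (3, 3), 3)
    canny = cv2.Canny(gray, 50, 100)
    # findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4;
    # the contour list is second from the end in both.
    cnts = cv2.findContours(canny, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    # The largest external contour is taken to be the playing field.
    contour = sorted(cnts, key=cv2.contourArea, reverse=True)[0]
    return cv2.boundingRect(contour)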
Here, we convert the image to grayscale, apply a Gaussian blur, and run Canny edge detection. This combination didn’t come from intuition; I found it by experimenting until the correct contour came out.
We only retrieve the external contours because 1) it reduces the number of contours returned and 2) we only need the outer areas of our image.
Getting The Board Pieces
Now that we know how to extract the game area, we need to determine the color of each piece on the board. The approach used here is to filter each piece's image by a color range for every candidate color. The range that returns the most white pixels in the filtered mask determines the color. The ranges for each color were found by experimentation.
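Here is a minimal sketch of the "range with the most matching pixels wins" idea. It swaps cv2.inRange for an equivalent inclusive comparison in plain NumPy so that it runs without OpenCV, and it uses only two ranges (the red and white limits that appear in this post, in BGR order). The classify function and the test patch are hypothetical.

```python
import numpy as np

# Two of the BGR ranges used in this post; the rest are omitted for brevity.
RANGES = {
    "R": (np.array([0, 0, 128]), np.array([0, 0, 255])),
    "W": (np.array([200, 200, 200]), np.array([255, 255, 255])),
}


def classify(segment, ranges=RANGES):
    """Return the label whose range matches the most pixels, or "N/A" if none match."""
    best_label, best_count = "N/A", 0
    for label, (lower, upper) in ranges.items():
        # A pixel matches when all three channels lie inside [lower, upper].
        mask = np.all((segment >= lower) & (segment <= upper), axis=-1)
        count = int(mask.sum())
        if count > best_count:
            best_label, best_count = label, count
    return best_label


# A 4x4 patch: mostly saturated red (BGR 0, 0, 200) with one white highlight pixel.
patch = np.zeros((4, 4, 3), dtype=np.uint8)
patch[:, :, 2] = 200
patch[0, 0] = (255, 255, 255)
print(classify(patch))  # R
```

Counting matches rather than requiring every pixel to match is what lets a gem with highlights or shading still be labeled by its dominant color.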
The following code gets the board state:
def getPlayingFieldInfo(self):
    # Responsible for getting the information
    img = self.getWindowShot()
    (x, y, w, h) = self._getPlayingFieldCoord(img)
    # The screenshot is RGB; swapping the channels gives the BGR order the limits below use.
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # The ordering for the limits follows:
    # 1) Red
    # 2) White
    # 3) Yellow
    # 4) Blue
    # 5) Purple
    # 6) Orange
    # 7) Green
    croppedImage = img[y:y+h, x:x+w]
    colors = ["R", "W", "Y", "B", "P", "O", "G"]
    lowerLimits = [
        np.array([0, 0, 128]),
        np.array([200, 200, 200]),
        np.array([0, 128, 128]),
        np.array([128, 128, 0]),
        np.array([128, 0, 128]),
        np.array([0, 64, 170]),
        np.array([32, 150, 32])
    ]
    higherLimits = [
        np.array([0, 0, 255]),
        np.array([255, 255, 255]),
        np.array([0, 255, 255]),
        np.array([255, 255, 0]),
        np.array([255, 0, 255]),
        np.array([0, 128, 255]),
        np.array([150, 255, 150])
    ]
    # 8x8 board, filled in row by row below.
    matrix = [[0] * 8 for _ in range(8)]
    for multipleY in range(1, 9):
        for multipleX in range(1, 9):
            # Each cell is 52 pixels wide and the grid starts 12 pixels in.
            imgSeg = croppedImage[12 + (52*(multipleY-1)):12 + (52*multipleY),
                                  12 + (52*(multipleX-1)):12 + (52*multipleX)]
            highestCount = 0
            theColor = "N/A"
            for (lower, higher, color) in zip(lowerLimits, higherLimits, colors):
                filteredImg = cv2.inRange(imgSeg, lower, higher)
                uniques, counts = np.unique(filteredImg, return_counts=True)
                counts = dict(zip(uniques, counts))
                # 255 marks pixels inside the range; keep the color with the most of them.
                if 255 in counts and counts[255] > highestCount:
                    highestCount = counts[255]
                    theColor = color
            matrix[multipleY-1][multipleX-1] = theColor
    return matrix
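The slicing in the nested loops relies on a fixed grid: a 12-pixel margin and 52-pixel cells, taken from the code above. As a small sanity check, the helper below (cell_bounds is a name made up for this sketch) computes the same slice limits for zero-based row and column indices.

```python
def cell_bounds(row, col, margin=12, cell=52):
    """Return (y0, y1, x0, x1) slice limits for a zero-based board cell."""
    y0 = margin + cell * row
    x0 = margin + cell * col
    return (y0, y0 + cell, x0, x0 + cell)


print(cell_bounds(0, 0))  # (12, 64, 12, 64)
print(cell_bounds(7, 7))  # (376, 428, 376, 428)
```

With these values the last cell ends at pixel 428, so the cropped playing area needs to be at least that large in both directions for every slice to be full-sized.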
After running the code base, the following matrix is returned for our image:
[
    ['W', 'W', 'O', 'Y', 'R', 'Y', 'Y', 'G'],
    ['G', 'Y', 'P', 'W', 'W', 'R', 'R', 'P'],
    ['W', 'W', 'O', 'O', 'B', 'G', 'Y', 'W'],
    ['P', 'R', 'G', 'B', 'P', 'Y', 'P', 'Y'],
    ['G', 'O', 'O', 'B', 'O', 'R', 'B', 'P'],
    ['Y', 'B', 'O', 'W', 'W', 'P', 'W', 'R'],
    ['P', 'Y', 'B', 'W', 'B', 'B', 'Y', 'R'],
    ['Y', 'G', 'P', 'W', 'O', 'G', 'G', 'P']
]
If you take a look at the output, you’ll notice that a couple of the pieces are incorrectly labeled. This is due to the dialog boxes showing where to make the first move.
Fortunately, it’s possible for our rule-based AI to make a valid first move. I’m not sure how well a reinforcement learning agent could handle this state. After the first move, there shouldn’t be much trouble getting the correct board state.
Now that we have a way to get the board state, the next part will involve coding up a rule-based AI. After that, I’ll be going over developing a reinforcement learning AI to play Bejeweled 1. The repository is hosted on my GitHub account.