Free Databricks-Generative-AI-Engineer-Associate Actual Exam Questions - Question 11 Discussion

Question No. 11
A Generative AI Engineer has successfully ingested unstructured documents and chunked them by
document sections. They would like to store the chunks in a Vector Search index. The current format
of the dataframe has two columns: (i) original document file name (ii) an array of text chunks for
each document.
What is the most performant way to store this dataframe?
Carlos N.
2026-02-21

B. Storing one chunk per row makes the most sense since vector search works best when each vector corresponds to a single row. Having unique IDs per chunk also helps with traceability and retrieval speed. Options A and C keep chunks grouped by document, which could slow down searches because you’d have to extract or flatten later anyway. Option D adds unnecessary complexity with JSON files, which isn’t as fast or scalable for typical vector DB queries. So B is definitely the more efficient and practical approach here.
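For anyone who wants to see what "one chunk per row with a unique ID" looks like in practice, here is a minimal sketch in plain Python. The column names (`chunk_id`, `file_name`, `chunk_text`) and the `file_name#position` ID scheme are assumptions made for illustration and are not taken from the question. In PySpark the same flattening step would typically be done with `posexplode` from `pyspark.sql.functions` before saving the result as a Delta table.

```python
def flatten_chunks(rows):
    """Turn (file_name, [chunk, ...]) pairs into one record per chunk.

    The record layout is hypothetical: any scheme works as long as
    every chunk ends up with its own row and a unique identifier.
    """
    flat = []
    for file_name, chunks in rows:
        for position, text in enumerate(chunks):
            flat.append(
                {
                    # Assumed ID scheme: source file plus chunk position.
                    "chunk_id": f"{file_name}#{position}",
                    "file_name": file_name,
                    "chunk_text": text,
                }
            )
    return flat


# Sample input in the shape described by the question:
# one row per document, with an array of text chunks.
docs = [
    ("guide.pdf", ["Intro text", "Setup steps"]),
    ("faq.pdf", ["Billing answers"]),
]

flat_rows = flatten_chunks(docs)
for row in flat_rows:
    print(row["chunk_id"], "|", row["chunk_text"])
```

Each output record maps to exactly one embedding, and the original file name stays on every row, so a retrieved chunk can be traced back to its source document.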

Sohail I.
2026-02-16

It’s B. Storing one chunk per row with unique IDs lets you index and search vectors directly without unpacking arrays, which is faster and scales better than grouped or JSON storage.

Mason H.
2026-02-15

It’s B for sure. Having one chunk per row means you can directly map vectors to rows, which is way faster for querying than dealing with arrays or JSON files. Options A and C keep chunks grouped, slowing down retrieval.

Mason H.
2026-02-12

B/D? D seems like extra overhead managing JSON files instead of a Delta table. B is more straightforward with one chunk per row for quick lookups and easier vector indexing.

Sohail H.
2026-01-30

B. Flattening the dataframe to one chunk per row makes the most sense here. Since each chunk needs its own vector representation and unique ID, having a row per chunk not only helps with indexing but also speeds up search queries. Keeping chunks grouped would just complicate retrieval, because you’d have to unpack arrays or do extra processing every time. Plus, saving as a Delta table keeps it scalable and efficient for updates and queries.

Peter W.
2026-01-27

B. Flattening the dataframe makes sense since each chunk needs its own vector embedding and unique ID for efficient search and retrieval. Grouping chunks together would slow down queries.

Chris C.
2026-01-25

B/C? Flattening helps query speed, but unique IDs per chunk seem crucial for retrieval.

Tom U.
2026-01-15

Option B
