Table of Contents
- 1. Introduction
- 2. Methodology
- 3. Technical Details
- 4. Experiments and Results
- 5. Analysis Framework Example
- 6. Future Applications and Directions
- 7. References
1. Introduction
Visual Concept Tokenization (VCT) represents a paradigm shift in unsupervised visual representation learning. Although modern deep learning methods have achieved great success across a range of vision tasks, they suffer from fundamental limitations, including data hunger, poor robustness, and a lack of interpretability. VCT addresses these challenges by introducing a transformer-based framework that decomposes images into disentangled visual concept tokens, mimicking human-like abstraction capabilities.
Key Performance Metrics
State-of-the-art results across multiple benchmarks, with significant margins over previous approaches.
2. Methodology
2.1 Visual Concept Tokenization Framework
The VCT framework uses a two-part architecture consisting of a Concept Tokenizer and a Concept Detokenizer. The tokenizer processes image patches through attention layers to extract visual concepts, while the detokenizer reconstructs the image from the resulting concept tokens.
2.2 Cross-Attention Mechanism
VCT uses only cross-attention between image tokens and concept tokens, deliberately avoiding self-attention among the concept tokens. This architectural choice prevents information leakage between concepts and keeps each concept token independent.
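The independence property described above can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention with no learned projections, and the toy vectors and function names are hypothetical rather than taken from the VCT implementation. Because each concept query attends only to the image tokens, changing one concept query leaves the output of every other concept query untouched.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention; each query is processed independently.

    queries: list of concept-token vectors
    keys, values: lists of image-token vectors (same number of rows)
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the image-token values for this one concept query.
        out = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy data (hypothetical): 3 image tokens and 2 concept queries, dimension 2.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
concept_queries = [[2.0, 0.0], [0.0, 2.0]]

baseline = cross_attention(concept_queries, image_tokens, image_tokens)

# Perturb only the second concept query. The first output cannot change,
# because no concept token ever attends to another concept token.
perturbed_queries = [concept_queries[0], [-5.0, 3.0]]
perturbed = cross_attention(perturbed_queries, image_tokens, image_tokens)

print(baseline[0] == perturbed[0])  # first concept output is unaffected
```

With self-attention among the concept tokens, the same perturbation would propagate to every concept output, which is the leakage this design is meant to avoid.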
2.3 Concept Disentangling Loss
The framework introduces a new Concept Disentangling Loss that enforces mutual exclusivity between different concept tokens, ensuring that each token captures an independent visual concept without overlap.
3. Technical Details
3.1 Mathematical Formulation
The core formulation is the cross-attention mechanism: $\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$, where $Q$ represents the concept queries, $K$ and $V$ represent the image tokens, and $d_k$ is the key dimension. The disentangling loss is defined as $\mathcal{L}_{disentangle} = \sum_{i\neq j} |c_i^T c_j|$, where $c_i$ denotes the $i$-th concept token; minimizing it reduces the correlation between different concept tokens.
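As a minimal sketch of the disentangling loss exactly as it is written in this section, the snippet below sums the absolute dot products over all ordered pairs of distinct concept tokens. The function name and the toy vectors are hypothetical, and a real training setup would compute this on learned, batched tensors rather than Python lists.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def disentangle_loss(concept_tokens):
    """Sum of |c_i^T c_j| over all ordered pairs with i != j."""
    total = 0.0
    for i, c_i in enumerate(concept_tokens):
        for j, c_j in enumerate(concept_tokens):
            if i != j:
                total += abs(dot(c_i, c_j))
    return total

# Orthogonal concept tokens incur no penalty.
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
loss_orthogonal = disentangle_loss(orthogonal)    # 0.0

# Identical concept tokens are penalized once per ordered pair: (0,1) and (1,0).
overlapping = [[1.0, 0.0], [1.0, 0.0]]
loss_overlapping = disentangle_loss(overlapping)  # 2.0

print(loss_orthogonal, loss_overlapping)
```

Because of the absolute value, anti-aligned tokens such as `[1, 0]` and `[-1, 0]` are penalized as much as aligned ones; the loss reaches zero only when every pair of tokens is orthogonal.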
3.2 Architecture Components
The model consists of multiple transformer layers, with concept prototypes and image queries shared across different images, enabling consistent concept learning regardless of input variation.
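The sharing described above can be sketched as follows. This is a single-layer toy under stated assumptions: the prototype and query values are made up, there are no learned projections, and the detokenizer is assumed to work by letting the shared image queries attend over the concept tokens, which the text does not spell out. The point is only that the same two parameter sets are reused for every image, so the number and role of the concept slots stay fixed while their contents depend on the input.

```python
import math

def attend(queries, memory):
    """Cross-attention where `memory` serves as both keys and values."""
    scale = math.sqrt(len(memory[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / scale for m in memory]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * m[d] for w, m in zip(weights, memory))
                        for d in range(len(memory[0]))])
    return outputs

# Hypothetical learned parameters, shared by every image.
CONCEPT_PROTOTYPES = [[1.0, 0.0], [0.0, 1.0]]            # 2 concept slots
IMAGE_QUERIES = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]   # 3 output patches

def tokenize(image_tokens):
    """Tokenizer direction: shared prototypes query the image tokens."""
    return attend(CONCEPT_PROTOTYPES, image_tokens)

def detokenize(concept_tokens):
    """Detokenizer direction (assumed): shared image queries read the concept tokens."""
    return attend(IMAGE_QUERIES, concept_tokens)

image_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_b = [[0.5, 2.0], [2.0, 0.5], [0.0, 0.0]]

concepts_a = tokenize(image_a)   # same prototypes ...
concepts_b = tokenize(image_b)   # ... different image, different concept values
reconstruction_a = detokenize(concepts_a)

print(len(concepts_a), len(concepts_b), len(reconstruction_a))
```

Both images yield exactly two concept tokens because the slot count is set by the shared prototypes, not by the input.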
4. Experiments and Results
4.1 Experimental Setup
Experiments were conducted on several benchmark datasets, including 3D scene datasets and complex multi-object environments. The framework was evaluated against state-of-the-art disentangled representation learning and scene decomposition methods.
4.2 Quantitative Results
VCT achieved the highest scores across all evaluation metrics, with significant improvements in disentanglement scores and reconstruction quality compared with previous approaches.
4.3 Qualitative Analysis
Visualizations show that VCT successfully learns to represent images as sets of independent visual concepts, including object shape, color, scale, background properties, and spatial relationships.
5. Analysis Framework Example
Core Insight: VCT's breakthrough lies in treating visual abstraction as a tokenization problem rather than a probabilistic regularization task. This fundamentally sidesteps the identifiability limits that have plagued earlier approaches such as VAEs and GANs.
Logical Flow: The methodology follows a clean inductive bias: cross-attention extracts the concepts, while the disentangling loss enforces their separation. This creates a virtuous cycle in which the concepts become more distinct as training proceeds.
Strengths & Flaws: The approach solves the information-leakage problem that undermined earlier disentanglement methods. However, the fixed number of concept tokens may limit adaptability to scenes of varying complexity, a limitation the authors acknowledge but do not fully address.
Actionable Insights: Researchers should explore dynamic token allocation, similar to adaptive computation time. Practitioners can apply VCT right away in domains that need interpretable feature extraction, particularly medical imaging and autonomous systems, where concept transparency is critical.
6. Future Applications and Directions
VCT opens up numerous possibilities for future research and applications. The framework could be extended to video understanding, enabling temporal concept tracking across frames. In robotics, VCT could facilitate object manipulation by providing disentangled representations of object properties. The approach also shows promise for few-shot learning, where learned concepts could transfer across domains with minimal adaptation.
7. References
1. Bengio, Y., et al. "Representation Learning: A Review and New Perspectives." IEEE TPAMI 2013.
2. Higgins, I., et al. "beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework." ICLR 2017.
3. Locatello, F., et al. "Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations." ICML 2019.
4. Vaswani, A., et al. "Attention Is All You Need." NeurIPS 2017.
5. Zhu, J.-Y., et al. "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks." ICCV 2017.