From a126811a15dafc92d74e05ee621af9ad0c98d7d2 Mon Sep 17 00:00:00 2001 From: Gergely Nagy Date: Sat, 13 Jan 2024 12:32:13 +0100 Subject: [PATCH] user: Document how repo language detection works Addresses Codeberg/Community#1391. Signed-off-by: Gergely Nagy --- .../language-detection/repo-languages.png | Bin 0 -> 11709 bytes docs/user/index.md | 1 + docs/user/language-detection.md | 116 ++++++++++++++++++ 3 files changed, 117 insertions(+) create mode 100644 docs/_images/user/language-detection/repo-languages.png create mode 100644 docs/user/language-detection.md diff --git a/docs/_images/user/language-detection/repo-languages.png b/docs/_images/user/language-detection/repo-languages.png new file mode 100644 index 0000000000000000000000000000000000000000..dfca6574c04d9fce7d225e095b5c6e2b8facb77c GIT binary patch literal 11709 zcmd6tWmKC@)UJWzrMSDhQ{0M63lw+Pk^;pYilk_f;_gr^I1~v`D8UKR28w&2xI3Km zUEh!MzTZ0k&z~e~&CIi&nS0ONJF~Btx0)(1uqm;TkdR)esVeFqAwBCw{C0CPPY$_Dt>-F{ft-e5A)$A%-zrP8AlE1?+-%T4oI8_&6gw7@?rx#|0QYe6-k=V180^$sZH z;*TZY;**Bkd^mv`L)a^Nd`wi7|9tflt5})fuW~$o+Bo@d+dl)aQNxi2U;Xcah(_|i zZFmEKgeaW<^;|@ArikN2F4`zxPY^u;V!W-Q)x*UspNZJbX39)?j;jO&*70=1CUX+A zx#F|nl93>J`0bD?+Va1T(j^b;JnELz=)Cd?}zWC{t+s3x9>GT zVp#fFbXm-$<(EP5o*DjXw;aU_w13U!jT|bd=kfNXiz0z#{c^yb#`nQE6>L;eJ`@eC zzW(o?c`?oDtXl9c{}0TodRm#&ORTMeRMw`-k!)aHM`og ze3E#$TY1vl19iFvJf0O4SUla%R+ME}bP%9qA)Won_g(hi{qmYut-7XBJ>a6+GVgNH z6I%QkE`8*`dpgP&`Zf?6a@jOP)_n{-?_7D{1+SCTsRkZ_eFJ_l(I#t3o%~og38f3z zDN=fpybG8`nOa)s(a)r!Aj-t0nDyPB_q*fBl24+_(EIuLSkmpR3nAEqYT=cSk)> za+W>!TYOq7G~RmntIBTdyx9L*W^#wWv3IsW7w#CkEmUBfWRky_7;;|nZn?{U7swZ6 zO?YR?+Dmsk88uDTb3MRJ_ISOKs09GEJ4o!7rcBFW(C8JOm+=L2ysJ*JL+Kh9Stx}D zZ42#o?15gp1#M8p$h-`PtaP;e!JY&3Jq902p$Jm=q6b*lUKq|q}QXFrtw0K zBhHheq?jMJ_(IiY$~B(jeW)NKUcvZR@Zkuf$bF2uz0f*{>RJ0lHJws$zq!s3ijc6( z8)jT_eTUFdqb_KRi%#*Qr;(r*XQDG{sHNW;uw#O|DmZi$jZo$4)=( z*?mJ|Fn91D-2(N*z{L$^>7G2yV?<jE~hfrbB8_)o?-eUFm?vI=vb-(}b8v^+-~!xpec z48Q&tqTF5mE7uTFLirq}<=UV%GVqQPR`R%b>(-r1&%-6KEklpM3iW*HD8zT(spRwY 
zUj^4YT(RnMDln|?TRs6ab&<8|eC zsnnnOGFp?9p5Lux5^!^H8#?18fS>saHRNLB^)#@jo7?;Xx;L#cyOM5T>bdoDFBwqA z)3(Ge?R_`STJPw-H=NMR)3&IylRUk+6Fkb@^d|XIxk9gr#ZC;LoE#{XBC}s+7HbQ55LmAJcM>f?dRsVJ`hn_(vjivp>%Z{ zntSWpA~}p_lPs@7oHQn$v*jY4-MjkuoXKTbcW-7jQTm4;fv=W}!|Og6S=Wm$``MZV zU;LUWp}qK8%&#+1w106y6MR0ki&(6n1E}wUOIxuKfdDeF2XAp%a?}{IZhIEoEjF)&+>rniN5Ws7*f7Q=z;1dmWRlX&FGFcP z3z#B2%m;7x!8Wjiqez*j8)&FS>8-nKRNMJUBHhU+{Yd%HtIidgai~&=`NlQ94r^~$ zk9-z#7d0 z_a`LwYSsfc&Q~Mwcu(``?&b!|6N5IHV%!8($bI-p|+X} z)Tiq~SzcI7WBvQ5m8S>GdFv<|>}ODfgRRU2KV7!BY=LsqQ=Tv<$Qa@*1M%yb7(y81 zGhlA$dO#c+Hc<;kC7{4e*XldMEU#W<4C1{HYz_IwTA?R#mc(bh-`ZL$Z|ZNha}pmH zh0l=!i`xO+X#=-M=A5Zc30kacH`0DbKqbb`WW3O=&!45fd$DyCUMZ02Nqxap&M1z! z2|Y!GOP3!R!sC0e^ZsTouXRc-Zl^OyXC2-JnBN#LE|^p}l@YTBHeG!^kXWldx1dbB=*7c9~65<`1%}!**>?HC07Q;3dB)OHU8ymRjHKT#nQjB`(M${{1#-!(VG1fB>&+zOBb9 zIUUUQJYL981rACsYcG7}3%wcL&9qJ6m?%+7&E>iKRTt9o)uE=*oiwiHUO%pQa-|J? z;G%R`bw9%$MdM@S!OM$ckRd5aDh1hS)8O_!sPPuTC8xxYH*FLp9|qc!9uZ`n{-*I= z)aJIqAzvcL&L*_FDjgaUFGs!2QyK*4JzE4|z62g2(jl!zat4uj1{{Wm^dP)^&)~M> zg}G!JL-yk>n4Z1%A^@9juE08q6vTQHeJ=KYmn>HwDV_FKd%whN4VM&G^w0vAx>up3v#XyrW^TW90 zc$_jjwTK==qZD_F68jHUZOI}h(F?D6uVq9!Oh)Px8s!!B5p6n?==WPt=I2z#ns}~l zGhOSYOG!_l(&O}NKAw0B`c3w|J_T&fAbHcH2DluN57Vo&ecX=oNX$>04 zgzZ;dboMNui^qNu+5oIz*MS-}_H#udD|EgV3Xx^Mjqtbpnv_J0DENlhp|U@W;K<4r z9XucURVQA(b8GR1#XJg%rGurV+|+aAO>gq z&i0LTgGU5sRJuZ#lp0LRVv#x_w+vdjr;bbbuJfOxUDwt)EBvmd*K<6#OlNCgmoD+k z99o8*{7u$Wc)m-1@3TX%-^55&muC!YuWj7@&-*EN9`o=P~1z7h?T0(MQH(2Wz+3 zGZ61?ah-#z{8{3CK?MEP@O3d$<@M`Ix2OhH@=~QCAweBBjXFSVT(j6-;RA>YJl0v! 
zKe_8s`a_SFN#o=_I6*U43wO8d1&6tQCDA#(UrxUlVEMX{zk?#MyA39Ink>iEd;0^~ zafLLEKGpP(vM$Ql+INcZG1D1StYiAKFIX(S;7_t}IRi3uo}Ys7@we020sh9nb5CMx zBqB)O5m(u)?L$0jUSJp3Q!?O3Zb9t|`5o#=MxWzl*H1}_o3MS&Rb}o_&)PJ%HXm6d z>-(xP5~Rs9@MShrcm!-D4y@cxsk{w2zy?AraOfdS*T|=ZS9A5Q20>soC(zcaungC-dG)iFwSb;K^V=E`& zoDn+W&ER{g_N>%#4@#9^ft0N${OQN=c*_+29RrlHnc;9ZX!5LV=;OWoNy(f&l-%v% zZ|=4JM6x)#PNNglDA%eh84^|VyN-Nx|Mh_mh2JLvG$0lSPpZzYjPo-)Xbkx0gTEfd zhT;vmtvJsXBIC0tD%{$mm?EDmH5ig5ZMV`m8MCJ;mSR{i_bV=6S@#h?I1^3~YkHCTJ7iF_2NPaE|#oxyIW_E;fu=VVGl8GwV$6%S2;b5H|> z5?D0bIgxC5lWi7PzK@&^vd;DIfiZ1-(XHj)?j}|Tg4(0D3!YbIQk+h!qB|5%ZCe`PUvzH>y{bv+f=oD44S=D*G% z^&d|A_<$=gf|k4)UGG7@qY+6KKdA1cXp9pCbB&yJ08BNBQ}MIPrP-mVv~x%0VAsBy z;CSJ`_E&u4BugRFzO&ULhZ5eursZvsaaU^eP zGc@`-LH$9D`R@w2BF>K5Qm&r_YwM%_FyK>@Rc!^d$OkpY<_6O9mOZZlMpLd8LV)=)h z|D+=7gsapxlO~(dgg(puaA~LazwIKEH$PablNCG530eV1=aV8a=dUGWKG-ViTbS*Y zHhg60<(7=>gLWp940W`Uq)*8z>d!IiL^%ze5l9^5|A2QnVcTb7qO-B249vqrN>=fe z(gw3mZ6bcMiKjr9o%N64^zqh^%C3rXw)_;zQVeExOn~+?}+|ei2UvZ;Kr)a3B_fTjZG;Q;-^gS|; zK<>^KG!MRIh~PUY3a8)}+({o9;ErZ6Zc3n&%#M(37hO=Hk2kKf&yODYMN_Q%+0PSL z1xjnZJO%kNPea*rJ$(iVQO2Vq>feU%Y5E_V#X={MYkV&O&@G{Snx652Hgcq0E|^#> zw;LYIY>>m44W{GQr1ee8j)ZnacLQO+B>OH*{v|azaQcfFNV&jjk8wF7{fM+*y3)89^f@rxO+R;n2;N5fo3ZE|czEfH z?D%4b!EmRC3YFTRD()Vn(N5Oa^~=-W$GoT9;{~ZDWi}|36-QBxsGJFmL?y~9gjsqy zVP}UXCFI+GF#pG}nnF(NUfzLjzzKWJhv#L}`cTtzqljF$*94+Sp8kPFKg8Q2PBj`v zP{pEH_nT)l@lF>{*Bcufeo&IR8# zuPMXj5r&9ZFHo!M4uMF=gpAm8jqgMi)Jrh*Z9CH<0#flPjz@1m8-X z#lGOGI>?KZB_GA@49daW|C08wD*!R@!FCj@pmkGK^LqY(ORjAcxfPC?QSwIMecBnL zKkb1V3<;f>dGtQjRaYdMLib>PcmuyuB-CGFR3&Y*PnYqHBA-24y?0UuR{ce>FmXcy zQpWTCcQXMt1@@W4cfNI~nNvvO1i%4)L$)qqM#DzhFLgqE9%wtcGk_(fzM3XtV~Bq& zk1Yes!owIxV8@Xe-{z5GkPr)8!H~y8^qwEiPsod6I?;5PeOdl6C6nC&DdYtC`O_eK3%d8iY48&|7 z3%A35J|=K~fR6uVG&_DQolE5Ig;E$A^c6eT4)=U2I+b<1UO-A%>{#~OXs^7@MO!KX zfV9rOrHoHZ<1ltYBu-U)3nr{kX4dY&FrP^3`Bqg~&3dY{y75Gr^Hu!z*W+3bUhAcc zQvzc}p4Rv`+nMeR;o+G2c2+iN)Lw2XdqTVuj7G%F=~mc!03$0=fR!v}{fpGCO!>E@ zIW4~9?1q=FsdZT+8`kb5-{s@U$wuKwOd*WGG5h66?k%Ai&l4KYm6h$h8%JJ=iP)F# 
zDx5(=PBG}Oa+P#WAPl8l16vZvTPv`L1T2{y^*!s{R0xg!;ax(@vR?S6pPB zNgwpoqCF@E3D~tgUm|(;>D7$6qA%7Ux1qO6Bt^2xOar>h5k(!NQN1mBndgPh%VZrl zKHEup{i9*HM&0N)CVbV{hT(;km(Er9%gqey-({R}eh%wVFo6?(pKylT=QDhdf0)Lt zp04vqbJC`U(CwxpnPQN;?dU^1M1KBS3%eGKwnvFi!EA8l)WGClwyQ1g+-d-fw;l>4 z#eQ4;`L$YwpZ)J2)?+`K2sAVlvXo8XH060Jo=+^#)>4pR*Th2T=d(8?|h^JNS?#&+0+E8QUOm= zbbq0GPof;z2w@9VZY8L)?V|3q z=edGwou(Xx-Fvbt0ad9TVp|XNvm&L_U0o69l2JLkTSMs{uIX%gT_K-P71Mfeu|mXD0tu6<0*N= zU^fj3P_a)jtP-duL})FKsPZta)CRAG6``JX!AQC}pI8kipz44d$ymOcy*{pBHex}p zy!H=Koi9=ct`hzpGJDeF|%bp{{xlqlugz zE%hIbD1jo=x4R0ezAN~zA(wo(u1L~In_Kh`vN;d zrg`*>c2+>o7@l2o5PFN_u!XoSPark-sB|v7-cOL&w|szdvBbOeq&9S>X})M1^+XIomIbM`g;8JS-K-u5hsz=24_#`Mie2# zeydsDH#Nz=&m^{MMM*Hik4AUDNgXyF)RQIA&~h3|LTOIEqc?ny3Y+z|9C>MO8_IxReVU1 zf{8&q8kR!{SF60MFU*aal5eZ=Hc zBjER!1^8PGe0z&;_AJZ(xAO4a`!Hd@AE)r~mGF0GHrf+CML0WbEDbF$18%xW12)s` zC7%xrqQ`K5qhmygkOL9#E*bTDctZ`|uc=g4>xgyG8WEj8+nhjg5*XEg#`cwq#%+|I zA?S6Q*GSrp^-0t3(;wJCJ+!>O@lnMF62{at$#x(yE`-gqDCI5j7*?#s1P=_uGn~v= zm9H#IaFREpdnTCm^zIHmuGVJWRG(?y)_6L`gQ%hD(ZTgqKAxR$9j1A5bs9grll!=o zqXM0)hh+5=V~mukw|Xb@^z+lwW{OIVw;xWrM#;NB#J428dClGAG*v=9Z{Yfptc~-? 
z`jJ!w0OI&2Z-Vuq{af78=-_qgXT)tHymYiU^{ z0LyN`x4J6-XwtKTrs~5g*-dm%JV)0sal&e_1u#jhS;GmzG^_@#`dbY#jVf4((4MXv zsU#aNo9(fbJbQ?cNd{!5y2~%Kf1l?fvis!qO*^FJb&r)(HjWCD`9QOTtK;$})#DYgI(6U=?ms=%E@FN^izSI4j0E(E+l zfAR&3t@qWw2=Of-r!LE)#kGkkMv9)1n=exsvTjWvN9`i9_1)!*SVDxMK6{FsQl$gI zp`MXSW~K(%e1~nXbKpf*lQ!iy5rGlQR#X#ej?lho;vN%rO4{wu;9LSWI_YvA@t?#7 zPg@)|i;~x?5p;Ugpdc;lR4>tl;9@9RPe8?1Ov!R`>XzDXNeJ3`w$XAg5gX#7&gM!i{%=ZUHJtGjAF zy1T9r0VyGrDH?yIQ8d&j>v*pKsP~F!3^VsrYtC2s4RvOHkfH6EDnrqNA&t1A8zGn> z`y&H09n;TLq`e9e@;D{l<9la}x;1}?pL=Y`Ak(%HKVgR62*_*na3nIS-e@+zinBUb zQxcQ8GY&0_&JuXrEEZu8)qB*cSnDm%$v*)Wh99gZ^S!uH#Y{0r{aGU}gQf}`5R}Y{ zdw+q10g&{vYn_s6HXtNti3ddKh3Xd$U-o-HST)&~pJDm%Q5||zD7&$KXvsQOt6xaw zE4v=@3o)20_7{k=xgP#3aS>b#7d?J~YC>SOs?;~4T~s^!!AIP=HV-bElt+n)gM&Qr z;S{~V;~o-yzS$ITBD&-i@ar_b6+*LbF18Sx*3O&IcxOi@xH`YAkRnSziB< z2YHMXHN!`FCE(&nUJ6}#VGMwHaPpEia9^7`IJrOA-v?+dnIN}Q7o?Jp0W^%%Ppbor zBa^oBd1>-y)N=*vOloukZ~;*3=q+E)mv5bLk}*mJ+mpnK66JTcijP_sT%~Ei{M@!| z`9cUhteh1wnTqYe&lUgRHic9O%EmXsz)Z3d-)VgGO9`haV?j*0PUT`8wK7kJ{2D=8 z*H(SaJ7RC`uHKuO)4a_W!wMOp@H-#qC z5H*>7n5fA4eSD&{wU3jC3pN_}wQEi^9^w`4_cn;SI}hUT`^2Cz{ddJJO|+kStGXW} z*<*{XvWgwBU!7b^z&Po+^jUkQhK5>gNr)K~@@Wzhh71U-TqFkZE0tPF!nO6})=>R> zhWHTNk8-?uNoGV3iS1B9OF*A(&A;d)1kTp~@fH56wcn`1eZ04LZxtst3+v68NvNc& z%C`n6^@Xm!nih#N3Bwk1b!JsIr?!h3a1vozvUv!TA8^c?+b(ekVudRLUy&#Sl_bNf zbP2d`I3Ft^Y&d~~G``%oumq&DQ;7}T&?L`d%t-YLMUpL83Hc|#uRluu0~aA96pLDA z!8>Du5m-60(|L{wv`o@xWZB&Y=D4&{yvVyO(GdAvDrTD{I_HkcT+qY;B~`0s>Xq@$ zFHHgrUJp@$7<;9rZewb4zS?uZZ(#l(&wjgj<8csw5X{9BAKMVkm;M(xSlELQ;gBzX zYW^cb5>|b&_LiBkH59>Yg=$}DC0BNM=I>4D?4={Yg&uqRJ;$DLuNWcoTF>gEW{S>J z7qy;`H9es;B(NH2!<9atoDy9XjdjDiNXMfg{10e|Bl&lB!KiOe;IaqJGqKOM zNs{Hg9x}wv^Wg?!jW%yFXuRHPm&S_e1(l^4a6TTAA;Z#b^yCxt?YD!|Q}*~huXM7E zfjhxhOQJN0y4L{McgcI!>pL1*>0erDh)OwEbj~Vby;BgGbh^f*Sz(gkk=rrX9pbbj zgc6TQbin?^hbPATulXauN7lda*~|iMpQHX9_DLt=IyQDA3T6>_;B|j8NVchUeVVL7 zYg{OZ09@9i3lRup*D-oQ;9>g!{DHZ5ULjx$pXi4-0xIEsl(|3dkHZHjHMSzCM8<;a zuO$hbx2oqFxcnZPBdFJ7LQw8#A?;)5lFuTXQiFB{!Ix?XD#GXF>Jyb~rU3$i@Y*fQ 
zAZ3`78v8Hc*~`y>K=v}1FOy`r`1Yn$=*;!Bw!^&NB;z}*OIi<{x+lM0IV;7c+vY(m zngP$eo$fDV_A|sB(*E9cPKmBdS;O%xy{44j{pEQQUU~4I5U5aVQ~Pi3{6kxcOQ^9p zmC6)cR1@fzJt#}YTI6|6tDz;s4@l5?5{*dSsW1tk?AQ0>zC(~+RMP|mi8TKbm%j!P zMzkDpR(OQb9*{5CPTSzU_H2Kn94Vt+x5h)*? zoF3rr-92;%PGBf6?Ayzz3gMtEUdi@#Ell1tg-=69u5M2dcAi8A!?=Ei2v?oiads$Qa)|rc(nf@Q zGv&V&{JR{|Oo-Qa+;I*iN6aD7*Dp)T_o#4rO!GIj$WXb#h2eI&A0{{R3 literal 0 HcmV?d00001 diff --git a/docs/user/index.md b/docs/user/index.md index 3f570479..dd26d54e 100644 --- a/docs/user/index.md +++ b/docs/user/index.md @@ -34,6 +34,7 @@ involved in running it on their machines. - [Actions](./actions/) - [Merge Message templates](./merge-message-templates/) - [Webhooks](./webhooks/) + - [Programming language detection](./language-detection/) - Authentication - [Generating an Access Token](https://docs.codeberg.org/advanced/access-token/) - [Access Token scope](./token-scope/) diff --git a/docs/user/language-detection.md b/docs/user/language-detection.md new file mode 100644 index 00000000..77570750 --- /dev/null +++ b/docs/user/language-detection.md @@ -0,0 +1,116 @@ +--- +title: 'Programming language detection' +license: 'CC-BY-SA-4.0' +--- + +Forgejo tries to detect the languages used in a repository, for each file, to use this information in a number of different ways. The most prominent use is for the display of the language statistics line at the top of the repository's home view, but the same information is also used to exclude generated and vendored files from diffs. + +![Repository language statistics](../_images/user/language-detection/repo-languages.png) + +This comes with sensible defaults that should work for most cases, but it is not without faults or compromises, and there are scenarios where the detection needs a little guidance. Further below, we will explain how it can be influenced, but lets look at the defaults first! 
+
+## Built-in language detection
+
+Whenever the contents of a repository change, Forgejo will look at the files, and try to put them into one of the following categories: "Programming language", "Markup language", "Documentation", "Dotfile", "Configuration", "Generated", "Vendored". The sorting into categories is done by the [go-enry][enry] package; please consult its documentation for the specifics. The library is _mostly_ compatible with [linguist][linguist], which is used by some other forges. The statistics are calculated based on the main branch of the repository, are only updated when the main branch changes, and the update may take some time.
+
+[enry]: https://github.com/go-enry/go-enry
+[linguist]: https://github.com/github-linguist/linguist
+
+For the repository language statistics, with the default configuration, only files that match the "Programming language" or "Markup language" categories are considered; everything else is ignored. When viewing a diff, all files are shown, but "Generated" files are collapsed by default, and "Vendored" files are marked as such.
+
+### A short explanation of categories
+
+While some of the categories are rather straightforward, a little explanation about them does not hurt, especially if you are considering reclassifying files into another category.
+
+**Vendored files** are any files you have checked into your repository that you didn't write, but imported from elsewhere. These may include dependencies you want a local copy of, at a specific version - such as JavaScript libraries or Go packages. These may inflate your project's language stats, and may even cause it to be labeled as another language. Marking these files as vendored makes it possible to ignore them for statistics.
+
+**Generated files** are - as the name implies - files that are generated, but still checked into the repository for one reason or another. These may include minified JavaScript, compiled CoffeeScript, various lock files, and so on. Similar to vendored files, you usually do not want these to show up in language stats. Unlike vendored files, they are also collapsed by default when viewing diffs.
+
+**Documentation** files are just what the name says: documentation. This includes files written in Markdown, AsciiDoc, Org Mode, and a number of others.
+
+**Configuration** files are typically used to configure software, and as such, the category is not included in language statistics. Languages like JSON, TOML, YAML, SQL, and XML - among other things - are considered configuration by default.
+
+**Dotfiles** are files whose name starts with a dot, which, by convention, suggests they should be hidden; as such, they are excluded from language statistics.
+
+**Programming languages** and **Markup languages** are more or less self-explanatory. The former category includes languages like C, Go, Rust, JavaScript, and many, many others. Markup languages include CSS, HTML, Jinja templates, Jupyter Notebooks, and numerous other formats.
+
+Please consult the [enry][enry] or [linguist][linguist] documentation for more details.
+
+## Adjusting the language detection
+
+Sometimes the programming language of a file is not recognized properly, or it is miscategorized. Forgejo provides a mechanism where the language detection can be told about the language of a file, and its category can be adjusted as well. The same mechanism can also force a file to be considered for language statistics, regardless of its category - or, conversely, tell Forgejo never to consider it.
+
+The way to do this is via a [`.gitattributes`][gitattributes] file. This file has a simple syntax where each line is made up of a pattern, followed by a space-separated list of attributes. There are many attributes that Git itself supports, but we're only going to talk about the custom attributes for language detection. All of these have a `linguist-` prefix, as that is where they originate from.
+
+[gitattributes]: https://git-scm.com/docs/gitattributes
+
+### Overriding the language
+
+In case Forgejo does not correctly recognize a file's language, you can use the `linguist-language` attribute to override its detection. The language names are case-insensitive, and may be specified using an alias. Spaces within language names must be replaced with hyphens.
+
+An example showing these overrides:
+
+```
+# Reclassify `.pl` files as Prolog
+*.pl linguist-language=Prolog
+
+# Whitespace in language names must be replaced with hyphens
+*.glyphs linguist-language=OpenStep-Property-List
+
+# Language names are case-insensitive, and may be specified using an alias.
+# All of the following three lines are equivalent:
+*.es linguist-language=js
+*.es linguist-language=JS
+*.es linguist-language=JavaScript
+```
+
+### Overriding the category
+
+It is possible to mark files matching a pattern as "Documentation", "Generated", or "Vendored". This can be useful when the automatic detection of these fails, or if you want to reclassify files for some other reason. It is also possible to reclassify files that would be considered any of these as not being "Documentation", "Generated", or "Vendored".
+
+To achieve this, the `linguist-documentation`, `linguist-generated`, and `linguist-vendored` attributes can be used. All three of these are boolean: you can set them simply by listing the attribute name. You can also set an explicit value by setting them to `true` or `false`. Prefixing the attribute with a minus sign is the same as setting it to `false`.
+
+It's best to illustrate this with a few examples!
+
+```
+# Do not consider Markdown files documentation anymore!
+# Note: Both forms here are equivalent.
+*.md -linguist-documentation
+*.md linguist-documentation=false
+
+# Consider files in `dist/` generated.
+# Both forms here are equivalent.
+/dist/**/* linguist-generated
+/dist/**/* linguist-generated=true
+
+# Do not categorize `cpplint.py` as vendored.
+cpplint.py -linguist-vendored
+cpplint.py linguist-vendored=false
+```
+
+Reclassifying a file results in Forgejo proceeding with language detection, which may still leave the file out of the statistics. Take a look at this example:
+
+```
+*.nib -linguist-generated
+*.nib linguist-language=Markdown
+```
+
+This will classify `*.nib` files as non-generated, and as Markdown. However, Markdown is considered documentation, which is, by default, excluded from language statistics.
+
+### Overriding detection
+
+In cases where a file should be considered for language statistics, regardless of its category, the `linguist-detectable` attribute can be used. The same attribute can be used to hide a file from the language statistics, without reclassifying it into a category that would otherwise be hidden.
+
+For a repository whose primary contents are documentation in Markdown format, the following override would make Forgejo consider the Markdown files for the language statistics:
+
+```
+*.md linguist-detectable
+```
+
+Similarly, to hide a file from the language statistics:
+
+```
+config/app.js -linguist-detectable
+```
+
+The above excludes the app's configuration - in JSON, but with a `.js` extension - from language statistics.
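
The three attribute forms described under "Overriding the category" (bare name, `-name`, and `name=value`) can be illustrated with a small sketch. This is not how Forgejo or go-enry read `.gitattributes`; the function name `parse_linguist_attributes` is hypothetical, and the sketch assumes patterns without spaces and treats only the exact values `true` and `false` as booleans.

```python
def parse_linguist_attributes(line):
    """Parse one .gitattributes line into (pattern, {attribute: value}).

    Illustrative only: real Git attribute parsing also handles quoted
    patterns, `!attr`, and macros, none of which are covered here.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments carry no attributes

    pattern, *tokens = line.split()
    attrs = {}
    for token in tokens:
        if token.startswith("-"):
            # `-linguist-vendored` is the same as `linguist-vendored=false`
            name, value = token[1:], False
        elif "=" in token:
            name, raw = token.split("=", 1)
            # Assumption: only the literal strings "true"/"false" are booleans;
            # anything else (e.g. a language name) is kept as a string.
            value = {"true": True, "false": False}.get(raw, raw)
        else:
            # A bare attribute name sets it to true
            name, value = token, True
        if name.startswith("linguist-"):
            attrs[name] = value
    return pattern, attrs


if __name__ == "__main__":
    for example in (
        "*.md -linguist-documentation",
        "*.md linguist-documentation=false",
        "/dist/**/* linguist-generated",
        "*.pl linguist-language=Prolog",
    ):
        print(parse_linguist_attributes(example))
```

The first two example lines parse to the same result, which mirrors the "both forms here are equivalent" comments in the documentation's own snippets.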