<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 16, 2022 at 2:46 PM Brad Wyble &lt;<a href="mailto:bwyble@gmail.com">bwyble@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Tsvi you wrote:</div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="color:rgb(0,0,0)"> </div><div style="color:rgb(0,0,0)"> For example in cognitive psychology there is a rich literature on salience (which again is a bit different from salience in the neural network community).  Salience is a dynamic process which determines how well a certain input or input feature is processed. Salience changes in the brain depending on what other inputs or features are concurrently present or what the person is instructed to focus on.  There is very little appreciation, integration or implementation of these findings in neural networks, yet salience plays a factor in every recognition decision and modality including smell and touch.</div><div style="color:rgb(0,0,0)"><br></div></div></blockquote><div><br></div><div>I'm having trouble understanding what you mean by this, since computational modelling of salience is a major thrust of computer vision.  Itti Koch &amp; Niebur (1998) has been cited 13,000 times and there are hundreds of papers that have elaborated on this ANN approach to salience computation in vision.  Is this not what you're asking for?  If not, what am I misunderstanding?</div></div></div></blockquote><div><br></div><div>That is a very interesting question and I would love to know more about the reconciliation of the two views. 
From what I understand, salience in cognitive science depends on both 1) the scene, as represented by pixels (or other sensor readings), and 2) the state of mind of the perceiver (focus, goal, memory, etc.). The current paradigm in computer vision, by contrast, seems to me to treat perception as bottom-up: the "true" salience of each image part is a function of the image alone, and the goal is to learn that function from examples. Furthermore, it seems to me that there is a consensus that salience detection is pre-inferential, so it cannot be learned in the classical supervised way: to select and label the data needed to learn salience, one would already need the very faculty that determines salience, which makes the procedure circular.</div><div><br></div><div>I'm very cautious about all this since it's far from my main expertise, so my aim is to ask for information rather than to state anything with certainty. I'm reading all these discussions with a lot of interest; I find that this channel fills a space between Twitter and formal scientific papers.<br></div><div><br></div><div>Best regards,</div><div>Balazs<br></div><div><br></div><div> <br></div></div></div>